Skip to content
Snippets Groups Projects
Commit 46dc75d7 authored by Clark Coleman's avatar Clark Coleman
Browse files

Fix for inst xrefs from fixed-call pushes.

parent ac946143
Branches
Tags
No related merge requests found
Pipeline #4141 passed
......@@ -2962,21 +2962,13 @@ STARS_InstructionID_Set_t STARS_IRDB_Instruction_t::GetReferencedInstructionIDs(
}
else if(IsJumpFromFixedCall())
{
Instruction_t* insn=const_cast<Instruction_t*>(GetIRDBInstruction());
RelocationSet_t::iterator rit;
for( rit=insn->getRelocations().begin();
rit!=insn->getRelocations().end();
++rit
)
Instruction_t* insn = const_cast<Instruction_t*>(GetIRDBInstruction());
for (const auto reloc : irdb_insn->getRelocations())
{
Relocation_t* reloc=*rit;
if(reloc->getType()==string("fix_call_fallthrough"))
{
Instruction_t* insn=dynamic_cast<Instruction_t*>(reloc->getWRT());
if(insn)
ret.insert(insn->getBaseID());
const auto insn_wrt = dynamic_cast<Instruction_t*>(reloc->getWRT());
}
if (insn_wrt != nullptr)
ret.insert(insn_wrt->getBaseID());
}
}
......
  • Owner

    @clc

    I don't understand how this is the fix. The jump part of a fixed call should have had the fix-call-fallthrough reloc. I thought you said the push part was broken?

  • Author Maintainer

    The logic in STARS was always that the push is pushing the fall-through address, and so I got the fall-through address from that inst in order to determine the fall-through successor block.

  • Owner

    Right.. that makes sense. But why is the change in the "IsJumpFromFixdCall()" part of the code? I thought that the push wasn't detecting the proper referenced IDs?

  • Owner

    In fact, with this patch, kill_deads is still failing if I disable my change to when to fix calls. So, I don't think this is right...

  • Owner

    I just realized I forgot to tag @clc in those posts. Not sure if he's otherwise getting these comments.

  • Author Maintainer

    That was a copy and paste error. In addition to fixing that in the push code, should I revert the jump code to what it was before?

  • Owner

    @clc, The diff shown above is the only diff in the commit/push. The only changes are in the IsJumpFromFixCalled clause. The Push part shows no diffs... so, I'm not sure what's going on.

  • Author Maintainer

    Right, that was in error. The change was supposed to be in the push part, not the jump part. So, in addition to putting the change in the push part like I intended, should I revert the jump part to depend on "fixed_call_fallthrough" reloc types?

  • Owner

    @clc, I tried putting similar code in the push part, and was still seeing the error. I'm not sure what the "right" answer is. If code was edited/committed in error and not tested, that seems problematic and likely deserves to be undone.

    If you think putting an altered version in the push part is the fix, I'm all for it, but I'm not currently seeing that as fixing the problem. If you've already pulled my fix_call.cpp changes, you may wish to change the "if 0" to an "if 1" and re-test sks.

  • Author Maintainer

    OK, I fixed the copy and paste error and tested with a debug build:

    helix64$ $PSZ ./sks.psexe ./sks.killdeads -c rida -c kill_deads Using Zipr backend. Detected ELF non-PIE executable. Performing step rida [dependencies=mandatory] ...Done. Successful. Performing step pdb_register [dependencies=mandatory] ...Done. Successful. Performing step fill_in_cfg [dependencies=unknown] ...Done. Successful. Performing step fill_in_indtargs [dependencies=unknown] ...Done. Successful. Performing step fix_calls [dependencies=unknown] ...Done. Successful. Performing step kill_deads [dependencies=unknown] ...Done. Successful. Performing step zipr [dependencies=none] ...Done. Successful.

    Now I will re-build without debug and re-test before I commit.

  • Owner

    Is that test with or without the fix_calls change? (I.e., does the fix-call log show that there are fixed calls?)

  • Author Maintainer

    I pulled the fix_calls change yesterday before my first re-test.

  • Owner

    So, that means you aren't seeing any fixed calls in your program and you aren't doing an actual test of that code. You need to look at the change to fix calls and undo it before your testing.

  • Author Maintainer

    OK, now I understand that your fix was a disabling. Sorry. Where is fix_calls?

  • Author Maintainer

    Never mind. Finally found it in ir-builders.

  • Author Maintainer

    When I re-enable the fix_calls code, I get the same problem. This time, it is because the push instruction has no relocations at all, or the only relocation becomes a nullptr:

    2935            if (this->GetIDAOpcode() == STARS_NN_push)
    (gdb)
    2938                    for (const auto reloc : irdb_insn->getRelocations())
    (gdb)
    2940                            const auto insn_wrt =
    dynamic_cast<Instruction_t*>(reloc->getWRT());
    (gdb)
    2942                            if (insn_wrt != nullptr)
    (gdb)
    2938                    for (const auto reloc : irdb_insn->getRelocations())
    (gdb)
    2978            success=true;
    (gdb)
    2979            return ret;
    Edited by Jason Hiser
  • Owner

    It appears there is a relocation, and that (dynamic) casting the getWRT field to an Instruction_t is nullptr. It'd be nice to know the getType() of the reloc, and if getWRT() really returns nullptr or just the cast is nullptr. (e.g., is getWRT(), which returns a BaseObj_t pointer, returning a pointer to something besides an Instruction_t? I can usually find the derived type of an object be printing the (dereferenced) base type pointer in gdb. Gdb prints the vtable pointer, typically with a symbol to the derived type. )

    If the reloc is type "push64", but getwRT is nullptr, that might means that we fixed the call, but could not find an instruction at the fallthrough location. This situation might happen if a no-return call gets fixed or something equally odd. Since this is rare, we should verify that this is really what is happening before just adding code to handle a nullptr in the getWRT field.

    If the reloc is a different type, or getWRT is something besides an Instruction_t or nullptr, then I feel like the reloc is in error and we should investigate how it got that way.

  • Owner

    @clc, forgot to tag again, dangit.

  • Author Maintainer

    I am getting all the emails without tagging. I will try to display the type of the relocation in gdb when I get time, soon.

  • Author Maintainer

    push64 is the relocation type.

    2935            if (this->GetIDAOpcode() == STARS_NN_push)
    (gdb)
    2938                    for (const auto reloc : irdb_insn->getRelocations())
    (gdb)
    2940                            const auto insn_wrt =
    dynamic_cast<Instruction_t*>(reloc->getWRT());
    (gdb) call reloc->getType()
    $3 = {static npos = <optimized out>,
      _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>>
    = {<No data fields>}, <No data fields>}, _M_p = 0xd1d7d68 "push64"}}
    Edited by Jason Hiser
  • Owner

    That seems normal

  • Author Maintainer

    Normal in what sense? Does this represent a failure to match the push to the fixed-call follow-through instruction?

  • Owner

    Normal in the sense that the type of reloc should, in fact, be "push64". The next question to answer is about what getWRT really points at, as mentioned above.

  • Author Maintainer

    I am getting a dump_map to help look at the code. I know that getWRT() cannot be cast as an Instruction_t* (it becomes nullptr when so cast).

  • Owner

    Yes, we established that. The question is whether it's actually null, or erroneously pointing at a non-Instruction_t. It'd be helpful to know which address in the original program the original call was part of.

  • Author Maintainer

    It is a call to a returning function that does not seem unusual:

    0000000000407180 <.text+0x2f40> push   %r14
    0000000000407182 <.text+0x2f42> mov    %rsp,%r14
    0000000000407185 <.text+0x2f45> lea    0x3809d4(%rip),%rdx        #
    000000000078
    7b60 <getservbyname@plt+0x383930>
    000000000040718c <.text+0x2f4c> lea    0x390ec5(%rip),%rax        #
    0000000000798058 <getservbyname@plt+0x393e28>
    0000000000407193 <.text+0x2f53> mov    0x58(%rax),%rax
    0000000000407197 <.text+0x2f57> mov    (%rax),%rax
    000000000040719a <.text+0x2f5a> cmp    $0x1,%rax
    000000000040719e <.text+0x2f5e> je     00000000004071a8 <getservbyname@plt
    +0x2f78>
    00000000004071a0 <.text+0x2f60> mov    (%rax),%rax
    00000000004071a3 <.text+0x2f63> jmp    00000000004071dc <getservbyname@plt
    +0x2fac>
    00000000004071a5 <.text+0x2f65> nopl   (%rax)
    00000000004071a8 <.text+0x2f68> sub    $0x10,%r15
    00000000004071ac <.text+0x2f6c> lea    0x408a45(%rip),%rax        #
    000000000080fbf8 <stderr+0x6848>
    00000000004071b3 <.text+0x2f73> cmp    (%rax),%r15
    00000000004071b6 <.text+0x2f76> jb     000000000040739f <getservbyname@plt
    +0x316f>
    00000000004071bc <.text+0x2f7c> lea    0x8(%r15),%rax
    00000000004071c0 <.text+0x2f80> movq   $0x400,-0x8(%rax)
    00000000004071c8 <.text+0x2f88> lea    0x390e89(%rip),%rbx        #
    0000000000798058 <getservbyname@plt+0x393e28>
    00000000004071cf <.text+0x2f8f> mov    0x60(%rbx),%rbx
    00000000004071d3 <.text+0x2f93> mov    %rbx,(%rax)
    00000000004071d6 <.text+0x2f96> mov    %r14,%rsp
    00000000004071d9 <.text+0x2f99> pop    %r14
    00000000004071db <.text+0x2f9b> retq

    etc.

    The call is:

    0000000000407157 <.text+0x2f17> callq  0000000000407180 <getservbyname@plt
    +0x2f5
    0>
    000000000040715c <.text+0x2f1c> lea    0x36968d(%rip),%rbx        #
    00000000007707f0 <getservbyname@plt+0x36c5c0>

    The fall-through after the call seems like a normal instruction.

    This is the only call to 0x407180 in the objdump.

    Recall that the fixed-call push looks like:

    1a46 push 0x40715c

    So, it has the original address (0x40715c) as the pushed value. The dump_map entries for the fixed-call push and the fall-through address are consecutive, with the fixed-call jump inserted at a later inst ID, as expected:

         1a46     407157     407157     8b140  ffffffff      19c0 push
    0x40715c(call 0x407180 Push part)
         1a48     40715c     40715c      1a4a  ffffffff      19c0 lea rbx, [7 +
    0x7707e9](lea rbx, [7 + 0x36968d])
    Edited by Jason Hiser
  • Owner

    Actually, I misread the fix_calls code. A push64 reloc never has any WRT field (besides nullptr). The target of the fixed call is encoded in the jump instruction's irdb_insn->getTarget() field. and the fallthrough is encoded in the jump instruction's fixed-call-fallthrough relocation's getWRT() field. Thus, I'd say that the relocs on the push and the jump portions of the fixed call are set set properly.

    Apparently, STARS isn't happy with how GetReferencedInstructionIDs is extracting that info... but i'm not sure what the interface for GetReferencedInstructionIDs is. I assume now that the "mystery" of the fixed call part is clear, that you can fix GetReferencedInstructionIDs.

  • Owner

    Also, it's really helpful if you typeset the code dumps in a code block. I've been adding them for you, so they are readable, but if you just cut/paste the raw data, it word wraps it like it was just text. You can use the quote icon in the top of the text editor box to typeset a portion of text as code.

  • Author Maintainer

    Fixing STARS to rely on the jump relocation gets us farther, then crashes due to a lack of robustness when dealing with improper function boundary ID in sks. Testing the fix for that problem and close to committing.

  • Author Maintainer

    Fixes committed. Tested on sks and ffmpeg, both large binaries with function boundary issues, using kill_deads.

0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment