From APK to Source: Complete Android Reverse Engineering Workflow

Dec 13, 2025 • android,reverse-engineering,apktool,dex2jar,cfr,jadx,pentest,mobile

Series: Android Pentesting (Part 5 of 5)

Android Pentest Lab Build
Android SSL Pinning Bypass with Burp Suite
Frida Method Hooking for Android App Analysis
Hooking Native Libraries with Frida Interceptor
From APK to Source: Complete Android Reverse Engineering Workflow

This post covers my end-to-end workflow for reverse engineering Android applications. When you’re pentesting a mobile app, static analysis of the decompiled source often reveals more than dynamic testing alone — hardcoded API keys, authentication logic, hidden endpoints, and vulnerable code patterns.

I covered Java decompilation with CFR in a previous post. This is the Android-specific workflow that gets you from an APK file to readable Java source code.

The Toolchain

You’ll need a few tools. Here’s what I use:

apktool — Decodes APK resources and disassembles DEX to smali
dex2jar — Converts DEX bytecode to Java class files
CFR — Decompiles class files to Java source
jadx — All-in-one alternative (DEX straight to Java)
unzip — APKs are just ZIP files

Install these on Kali/Debian:

sudo apt install apktool dex2jar unzip
wget https://www.benf.org/other/cfr/cfr-0.152.jar

For jadx, grab a release from GitHub:

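wget https://github.com/skylot/jadx/releases/download/v1.4.7/jadx-1.4.7.zip
unzip jadx-1.4.7.zip -d jadx
# The jadx and jadx-gui launchers live in jadx/bin; put them on PATH
export PATH="$PWD/jadx/bin:$PATH"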
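Before going further, it can save time to confirm that everything is actually reachable from the shell. The helper below is a minimal sketch: missing_tools is a name made up for this post, and it only checks that a command is on PATH, not that it runs correctly.

```bash
# Print the name of every listed command that is not on PATH.
# missing_tools is a hypothetical helper, not part of any of the tools above.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: missing_tools adb apktool d2j-dex2jar jadx java unzip
```

An empty result means every tool was found. CFR is a plain jar rather than a command, so check for the file separately.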

Getting the APK

If you have the app installed on a device or emulator, pull the APK with adb:

# Find the package name
adb shell pm list packages | grep -i appname

# Get the APK path
adb shell pm path com.example.app
# Output (the exact path varies by device and Android version):
# package:/data/app/com.example.app-1/base.apk

# Pull it, using the path printed above
adb pull /data/app/com.example.app-1/base.apk target.apk

Apps installed as split APKs (common for apps distributed as Android App Bundles) report several paths:

adb shell pm path com.example.app
# package:/data/app/com.example.app-1/base.apk
# package:/data/app/com.example.app-1/split_config.arm64_v8a.apk
# package:/data/app/com.example.app-1/split_config.en.apk

# Pull all of them
adb pull /data/app/com.example.app-1/ ./app_splits/
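If pulling the whole directory fails, or you would rather not copy paths by hand, you can feed the pm path output straight into adb pull. This is a sketch under a few assumptions: a single device is connected, and strip_pm_prefix and pull_all_apks are helper names invented here.

```bash
# Turn `pm path` output ("package:/path/to.apk") into bare paths.
# Also drops carriage returns, which some adb versions append to each line.
strip_pm_prefix() {
    tr -d '\r' | sed -n 's/^package://p'
}

# Pull every APK that belongs to a package into a local directory.
# Usage: pull_all_apks com.example.app [dest_dir]
pull_all_apks() {
    local pkg=$1 dest=${2:-app_splits} path
    mkdir -p "$dest"
    adb shell pm path "$pkg" </dev/null | strip_pm_prefix |
    while IFS= read -r path; do
        # </dev/null keeps adb from consuming the loop's input
        adb pull "$path" "$dest/" </dev/null
    done
}
```

Each split lands in the destination directory under its original file name, so base.apk is still the one to decompile first.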

Understanding APK Structure

Before diving in, it helps to understand what’s inside an APK:

target.apk
├── AndroidManifest.xml      <- App config, permissions, components (binary XML)
├── classes.dex              <- Compiled Dalvik bytecode
├── classes2.dex             <- Additional DEX (multidex apps)
├── resources.arsc           <- Compiled resources
├── res/                     <- Resource files (layouts, drawables, etc.)
├── assets/                  <- Raw assets (often interesting files here)
├── lib/                     <- Native libraries (.so files)
│   ├── arm64-v8a/
│   ├── armeabi-v7a/
│   └── x86_64/
└── META-INF/                <- Signing info

The code lives in classes.dex (plus classes2.dex, classes3.dex, and so on in multidex apps). That’s what we need to decompile.
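To see how many DEX files you are dealing with before decompiling, you can list the archive entries and keep only the top-level classes*.dex names. A small sketch, with dex_entries and list_dex as made-up helper names; it relies on unzip -Z1, which prints one entry name per line.

```bash
# Keep only top-level classes.dex, classes2.dex, ... from a list of entry names.
dex_entries() {
    grep -E '^classes[0-9]*\.dex$'
}

# List the DEX files inside an APK.
# Usage: list_dex target.apk
list_dex() {
    unzip -Z1 "$1" | dex_entries
}
```

DEX files stored elsewhere in the archive (for example under assets/) are filtered out on purpose; those are worth a separate look because they are often loaded dynamically.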

Method 1: The Quick Way (jadx)

If you just want to see the source fast, jadx does everything in one shot:

jadx -d output_dir target.apk

This produces:

output_dir/
├── resources/
│   ├── AndroidManifest.xml    <- Decoded, readable XML
│   └── res/                   <- Decoded resources
└── sources/
    └── com/
        └── example/
            └── app/           <- Decompiled Java source

jadx handles DEX-to-Java directly without intermediate steps. The output is usually good enough for analysis. I use this when I need quick results.

For a GUI with search and cross-references:

jadx-gui target.apk

Method 2: The Detailed Way (apktool + dex2jar + CFR)

Sometimes jadx struggles with obfuscated apps, or you need more control over the process. The traditional toolchain exposes every intermediate artifact (smali, JAR, class files), so you can inspect or swap out any stage.

Step 1: Decode with apktool

apktool d target.apk -o apktool_output

This gives you:

apktool_output/
├── AndroidManifest.xml    <- Decoded, readable XML
├── apktool.yml            <- apktool metadata
├── original/              <- Original META-INF
├── res/                   <- Decoded resources
├── smali/                 <- Disassembled DEX (smali code)
└── smali_classes2/        <- Additional DEX files

The smali output is useful for:

Patching the app (modify smali, rebuild with apktool b)
Understanding obfuscated code flow
Finding strings that don’t decompile well
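For the patching case, the rebuild usually goes through three steps: build, align, sign. The function below is a sketch, assuming zipalign and apksigner from the Android SDK build-tools are on PATH and that you already have a keystore; rebuild_apk and its argument order are invented for this post.

```bash
# Rebuild a decoded app, align it, and sign it with an existing keystore.
# Usage: rebuild_apk <decoded_dir> <keystore> <keystore_password> [out.apk]
rebuild_apk() {
    local src=$1 keystore=$2 pass=$3 out=${4:-patched.apk}
    local base=${out%.apk}

    # 1. Reassemble smali and resources into an unsigned APK
    apktool b "$src" -o "$base-unsigned.apk" || return 1

    # 2. Align before signing; aligning afterwards would invalidate the signature
    zipalign -f -p 4 "$base-unsigned.apk" "$base-aligned.apk" || return 1

    # 3. Sign the aligned APK
    apksigner sign --ks "$keystore" --ks-pass "pass:$pass" \
        --out "$out" "$base-aligned.apk"
}
```

Because the rebuilt APK is signed with your own key, the original app has to be uninstalled from the device before the patched one will install.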

Step 2: Convert DEX to JAR

Extract the DEX files and convert them:

# Extract DEX files from APK
unzip -j target.apk "*.dex" -d dex_files/

# Convert each DEX to JAR (repeat for classes3.dex and up if present)
d2j-dex2jar dex_files/classes.dex -o classes.jar
d2j-dex2jar dex_files/classes2.dex -o classes2.jar

Step 3: Decompile with CFR

Now use CFR to get Java source:

java -jar cfr-0.152.jar classes.jar --outputdir ./decompiled
java -jar cfr-0.152.jar classes2.jar --outputdir ./decompiled

This gives you a clean directory structure matching the package hierarchy.

Dealing with Obfuscation

Many production apps are shrunk and obfuscated with R8 (or ProGuard in older builds). You’ll see:

Classes renamed to a.class, b.class, etc.
Methods named a(), b(), c()
Meaningless package names like com.a.b.c

Here’s how to work with it:

1. Find Entry Points

Start with the AndroidManifest.xml. Activities, Services, and Receivers are listed with their full class names. These are your entry points:

<activity android:name="com.example.app.MainActivity">
    <intent-filter>
        <action android:name="android.intent.action.MAIN"/>
    </intent-filter>
</activity>
<service android:name="com.example.app.sync.SyncService"/>
<receiver android:name="com.example.app.PushReceiver"/>

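Even if the code is obfuscated, these class names usually survive. Android instantiates components by the name in the manifest, so the manifest always points at a real class, and default R8/ProGuard configurations keep those names unchanged.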
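On a large manifest it helps to pull the component list out as plain text. The sketch below assumes the decoded manifest from apktool or jadx, where each element's opening tag sits on one line; manifest_components is a name made up here.

```bash
# Print "<component-type> <class-name>" for every component in a decoded manifest.
# Usage: manifest_components apktool_output/AndroidManifest.xml
manifest_components() {
    grep -oE '<(activity|activity-alias|service|receiver|provider) [^>]*android:name="[^"]*"' "$1" |
        sed -E 's/^<([a-z-]+) .*android:name="([^"]*)"$/\1 \2/'
}
```

For the manifest fragment above, this prints the activity, service, and receiver with their class names, and skips the nested action element.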

2. Search for Strings

Grep for interesting strings in the decompiled source:

# API endpoints
grep -rE "https?://" decompiled/ | grep -vE "\.(png|jpg)"

# Hardcoded keys
grep -rEi "(api[_-]?key|secret|token|password)" decompiled/

# Crypto operations
grep -rE "(AES|RSA|DES|encrypt|decrypt)" decompiled/

# SQL queries
grep -r "SELECT\|INSERT\|UPDATE" decompiled/

# SharedPreferences (local storage)
grep -r "getSharedPreferences\|putString\|getString" decompiled/
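Raw grep output gets noisy on a big app. One way to tame it is to extract only the URLs, deduplicate them, and then reduce them to hostnames. This is a rough sketch: extract_urls and url_hosts are invented names, and the character class is an approximation of what may appear in a URL, so expect the odd false positive.

```bash
# Print every distinct http(s) URL found under a directory.
# Usage: extract_urls decompiled/
extract_urls() {
    grep -rhoE 'https?://[A-Za-z0-9._~:/?#@!$&()*+,;=%-]+' "$1" | sort -u
}

# Reduce a list of URLs on stdin to distinct hostnames.
# Usage: extract_urls decompiled/ | url_hosts
url_hosts() {
    sed -E 's#^https?://([^/:?]+).*#\1#' | sort -u
}
```

The host list is a quick way to spot staging or debug servers that sit next to the production API.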

3. Trace from Known Code

Find unobfuscated library code and trace into the app code. For example, Retrofit API definitions often reveal endpoint structures:

public interface ApiService {
    @POST("api/v1/login")
    Call<LoginResponse> login(@Body LoginRequest request);
    
    @GET("api/v1/user/{id}")
    Call<User> getUser(@Path("id") String userId);
}
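Retrofit reads these annotations at runtime, so the route strings generally survive obfuscation even when the interface and method names do not. A sketch for listing them, assuming the app uses Retrofit's standard HTTP method annotations; retrofit_routes is a made-up name.

```bash
# Print every distinct Retrofit route annotation found under a directory.
# Usage: retrofit_routes decompiled/
retrofit_routes() {
    grep -rhoE '@(GET|POST|PUT|PATCH|DELETE|HEAD)\("[^"]*"\)' "$1" | sort -u
}
```

Run against the interface above, it reports the login and user routes, which gives you a first draft of the API surface to test.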

Extracting Native Libraries

Some apps put sensitive logic in native code. The .so files are in lib/:

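# Keep the directory structure so same-named libraries for different ABIs don't overwrite each other
unzip target.apk "lib/*" -d native_libs/

For basic analysis:

# List exported functions
nm -D native_libs/lib/arm64-v8a/libnative.so

# Search for strings
strings native_libs/lib/arm64-v8a/libnative.so | grep -i key

# Disassemble with objdump (on an x86 host this needs an ARM-capable build,
# e.g. aarch64-linux-gnu-objdump or binutils-multiarch)
objdump -d native_libs/lib/arm64-v8a/libnative.so > disasm.txt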

For serious native reversing, load them into Ghidra or IDA.
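A useful first filter on the export list is the JNI naming convention: statically bound native methods are exported as Java_<package>_<class>_<method>, and libraries that register methods dynamically export JNI_OnLoad instead. The sketch below picks those out of nm output; jni_symbols and list_jni_exports are invented names, and the library path is whatever you extracted.

```bash
# Keep exported code symbols that follow the JNI naming convention.
# Reads `nm -D` output (address, type, name) on stdin.
jni_symbols() {
    awk '$2 == "T" && ($3 ~ /^Java_/ || $3 == "JNI_OnLoad") { print $3 }'
}

# List the JNI entry points of a shared library.
# Usage: list_jni_exports path/to/libnative.so
list_jni_exports() {
    nm -D --defined-only "$1" | jni_symbols
}
```

If only JNI_OnLoad shows up, the Java-to-native mapping is set up at runtime via RegisterNatives, and you will need a disassembler (or a runtime hook) to recover it.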

Automation Script

Here’s a script that does the full workflow:

#!/bin/bash
# apk_decompile.sh - Full APK decompilation workflow

APK=$1

# Check arguments before deriving anything from them
if [ -z "$APK" ]; then
    echo "Usage: $0 <apk_file> [output_dir]"
    exit 1
fi

OUTPUT=${2:-"decompiled_$(basename "$APK" .apk)"}
# Path to the CFR jar; override with CFR_JAR=/path/to/cfr.jar
CFR_JAR=${CFR_JAR:-cfr-0.152.jar}

# Create the directories that unzip, dex2jar, and CFR write into
mkdir -p "$OUTPUT/dex" "$OUTPUT/jar" "$OUTPUT/java"

echo "[*] Running apktool..."
apktool d "$APK" -o "$OUTPUT/apktool" -f

echo "[*] Extracting DEX files..."
unzip -jo "$APK" "*.dex" -d "$OUTPUT/dex"

echo "[*] Converting DEX to JAR..."
for dex in "$OUTPUT/dex/"*.dex; do
    [ -e "$dex" ] || continue   # the glob matched nothing
    jar_name=$(basename "$dex" .dex).jar
    # --force overwrites jars left over from a previous run
    d2j-dex2jar "$dex" -o "$OUTPUT/jar/$jar_name" --force 2>/dev/null
done

echo "[*] Decompiling with CFR..."
for jar in "$OUTPUT/jar/"*.jar; do
    [ -e "$jar" ] || continue   # dex2jar produced nothing
    java -jar "$CFR_JAR" "$jar" --outputdir "$OUTPUT/java" --silent true
done

echo "[*] Running jadx for comparison..."
jadx -d "$OUTPUT/jadx" "$APK" --no-res 2>/dev/null

echo "[*] Done. Output in $OUTPUT/"
echo "    apktool/  - Decoded resources and smali"
echo "    java/     - CFR decompiled source"
echo "    jadx/     - jadx decompiled source"

What to Look For

Once you have the source, here’s what I hunt for during an assessment:

Authentication & Authorization

How are tokens stored? (SharedPreferences, SQLite, files)
Is certificate pinning implemented? How?
Are there hardcoded credentials or API keys?

Data Storage

What’s stored locally? Is it encrypted?
Are there SQLite databases with sensitive data?
Check the assets/ folder for config files

Network Communication

What endpoints does the app talk to?
Is there a debug/staging server hardcoded?
Are there websocket or MQTT connections?

Cryptography

Is crypto implemented correctly?
Are there hardcoded keys or IVs?
Is the app using weak algorithms (MD5, DES, ECB mode)?
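The weak-algorithm check is easy to turn into a first-pass grep, since both Cipher and MessageDigest take the algorithm as a string passed to getInstance. This is a sketch with a deliberately short pattern list, and find_weak_crypto is a made-up name. It also flags a bare "AES" transformation, on the assumption that the default provider falls back to ECB mode when no mode is given; verify that against the app's actual provider.

```bash
# Flag getInstance() calls that request MD5, DES/3DES, a bare "AES", or any ECB mode.
# Usage: find_weak_crypto decompiled/
find_weak_crypto() {
    grep -rnE 'getInstance\("(MD5|DES|DESede|AES|[A-Za-z0-9]+/ECB/[A-Za-z0-9]+)"' "$1"
}
```

Treat hits as leads, not findings: an MD5 used for a cache key is not the same problem as an MD5 used for password storage. Algorithm names built at runtime or hidden behind string obfuscation will not show up here.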

Hidden Functionality

Debug modes or admin features
Feature flags that enable extra functionality
Test accounts or bypass codes

Next Steps

Once you’ve found interesting code paths through static analysis, you can:

Use Frida to hook methods at runtime
Bypass SSL pinning to intercept traffic
Patch the smali and rebuild the APK
Hook native functions for deeper analysis

The combination of static and dynamic analysis gives you the full picture of how an app works and where it’s vulnerable.
