Tuesday, December 1, 2015

Syncing Android Open Source Project Code Over an Unreliable Connection

Because the AOSP source tree is very large, several problems can arise when syncing over a slow and unreliable Internet connection. In that situation, the regular old "repo sync -j4" doesn't work well.

So what are the issues?

1. Repo tries to download every available branch of every project
This behavior is handy, but it is undesirable when the Internet connection is slow.

To get repo to sync only the branch defined in the manifest, use the -c flag. For example: repo sync -j4 -c

2. When a project fails to download, the whole sync is aborted
You have been waiting on your sync for hours when your connection suddenly goes down, and so the sync fails. You were hoping you could just re-run the sync command and continue where it left off. Unfortunately, you find that the sync restarts!

To work around this, append "-f" to the repo sync command. This forces the sync to continue even if one or more projects fail.
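
Since -f only keeps the sync going past a failed project, the projects that failed still need another pass. One way to automate that is a small retry wrapper. The following is a minimal sketch, not something repo provides: the retry function, its arguments, and the attempt count and delay in the example are all hypothetical names and values chosen for illustration.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it exits successfully
# or MAX_ATTEMPTS runs have failed.
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example (assumed values): up to 5 attempts, 60 seconds apart.
#   retry 5 60 repo sync -j4 -c -f
```

The wrapper only looks at the exit status of the command, so it works with any command line, including the per-project syncs described further down.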

3. A sync with -f gets aborted because the computer restarts or there is a power outage
Unfortunately, in this situation repo doesn't keep partially downloaded projects. Even with the -f flag, the projects are only written to the directory once repo finishes fetching all of them. In other words, when the sync is terminated abruptly, you have to restart it from the beginning!

To work around this, sync the projects one at a time using repo sync -j4 -c -f [project], where project is a project name or path taken from the manifest file of interest.

I personally wrote a simple script to dump each of the paths that would be synced, and then used xargs to sync them one at a time:

a. My Scala script to dump each of the paths. I'm sure this can be done far more efficiently using sed, but I'm not familiar enough with it and don't have the time to learn.

#!/bin/sh
exec scala "$0" "$@"
!#

import java.io.File
import java.util.Scanner
import scala.collection.mutable.ListBuffer

// Collect the value of every path="..." attribute, one input line at a time.
def getPaths(s: Scanner): List[String] = {
    val paths = ListBuffer[String]()
    while (s.hasNextLine) {
        val line = s.nextLine
        val start = line.indexOf("path=")
        if (start >= 0) {
            // rest begins at the opening quote; find the closing quote after it.
            val rest = line.substring(start + "path=".length)
            val end = rest.indexOf("\"", 1)
            // Skip lines with an empty path or no closing quote.
            if (end > 1) paths += rest.substring(1, end)
        }
    }
    paths.toList
}

// Input file holding the manifest contents; adjust the path for your machine.
val f = new File("/home/aharijanto/tmp/cm12path.txt")
val s = new Scanner(f)
getPaths(s).foreach(println)
s.close()

Save the script above as repopath.scala, make it executable (chmod +x repopath.scala), and then run it like:
./repopath.scala > ~/tmp/path.txt
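
For comparison, the same extraction can be sketched with sed. This is only an illustration under the same assumption as the Scala script, namely that each path="..." attribute sits on a single line; extract_paths is a hypothetical function name, not part of repo.

```shell
#!/bin/sh
# extract_paths FILE
# Print the value of every non-empty path="..." attribute in FILE, one per line.
# Lines without a path attribute produce no output.
extract_paths() {
    sed -n 's/.*path="\([^"][^"]*\)".*/\1/p' "$1"
}

# Example, using the same input file as the Scala script:
#   extract_paths /home/aharijanto/tmp/cm12path.txt > ~/tmp/path.txt
```

Like the Scala version, this skips manifest entries that have no path attribute at all, so those would still have to be synced separately.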

b. Go to the directory where repo was initialized and run:
cat ~/tmp/path.txt | xargs -I xxx sh -c 'echo xxx > ~/tmp/log.txt ; repo sync -j4 -c -f xxx'

This way, if the sync is aborted, you just need to check the log file, which holds the last project that was started, and continue from that project.
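
Continuing from the logged project can itself be scripted. Below is a minimal sketch, assuming the log file contains exactly one line (the last project started) and that the path list is unchanged since the aborted run; remaining_paths is a hypothetical helper name.

```shell
#!/bin/sh
# remaining_paths PATHS_FILE LOG_FILE
# Print the paths that still need syncing: everything from the project named
# in LOG_FILE onward, or the whole list if LOG_FILE is missing or empty.
# If the logged project is not in PATHS_FILE, nothing is printed.
remaining_paths() {
    paths_file=$1
    log_file=$2
    last=""
    if [ -s "$log_file" ]; then
        last=$(head -n 1 "$log_file")
    fi
    if [ -z "$last" ]; then
        cat "$paths_file"
    else
        # Start printing at the first line equal to the logged project.
        awk -v last="$last" '$0 == last { found = 1 } found' "$paths_file"
    fi
}

# Example: resume with the same xargs command as before.
#   remaining_paths ~/tmp/path.txt ~/tmp/log.txt | xargs -I xxx sh -c 'echo xxx > ~/tmp/log.txt ; repo sync -j4 -c -f xxx'
```

The logged project is synced again on purpose, since the log only records that it was started, not that it finished.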