Ero Carrera (ero) <ero carrera gmail com>
Wednesday, November 21 2007 22:44.00 CST
The other day I was talking with a friend and the discussion turned to when certain anti-disassembly, anti-debug, etc. techniques might have first appeared. That's bound to be difficult to pin down, because tricks are often discovered independently by different people at around the same time.
So I thought: a trick will usually be regarded as "common" once it gets implemented in some packer, since packers try to make analysis difficult and will embed whichever tricks are good or popular within the underground at the time, in order to make the reverse engineering process as cumbersome as possible. Therefore, if I could somehow place packers in time, I'd have a starting point...
That led me to remember Google Groups. It's possible to restrict queries to date ranges, and the archives go back to 1981. I quickly put together a script to scan from 1981 through 2007, one month at a time, for a set of popular packers.
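The original script isn't reproduced here, but the one-month scanning window is easy to sketch. The function name `month_windows` and the choice of an exclusive end date are assumptions of mine, not details from the script:

```python
from datetime import date

def month_windows(first_year, last_year):
    """Yield (start, end) date pairs, one per calendar month.

    `end` is the first day of the following month, so it is exclusive.
    """
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            start = date(year, month, 1)
            # December rolls over into January of the next year.
            end = date(year + (month == 12), month % 12 + 1, 1)
            yield start, end

# One window per month from January 1981 through December 2007.
windows = list(month_windows(1981, 2007))
```

Each packer name would then be queried once per window, which is where the query count comes from.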
The most painful part of the whole process was fooling Google... they sure do not like robots... whenever they get a bunch of very simple automated queries they'll serve back a "403 Forbidden" saying the queries look like they are coming from a virus or spyware app... But my script is good, it's no evil spyware... so I got into the mood of working my way around the checks. I needed to run quite a few queries (> 10K), so I had better make Google believe I'm not a robot. Besides finding the right timing for the queries (querying too often will make Google sad), I had to distribute the search over a few hosts and randomize the headers, the User-Agents and the query itself (just throw in some randomized, "orthogonal" search terms, i.e. ones that have nothing to do with your query). After that the script was good to go...
So, after mining the newsgroups for popular packer names (the search string was, most of the time, " exe" plus the "randomized" terms), I got a cute small data set to throw into Mathematica...
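To make the moving parts concrete, here is a rough sketch of how one query could be planned. It sends nothing over the network. Everything specific in it is an assumption for illustration: the User-Agent strings, the decoy terms, the host labels, the delay range, and the choice to attach the decoy as an excluded term (the post doesn't say how the orthogonal terms were combined with the real ones).

```python
import random

# Hypothetical values; the lists actually used are not given in the post.
USER_AGENTS = [
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Firefox/2.0",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
    "Opera/9.24 (Windows NT 5.1; U; en)",
]
DECOY_TERMS = ["weather", "recipe", "guitar", "football", "garden"]
HOSTS = ["host-a", "host-b", "host-c"]

def plan_query(packer, index, rng, min_delay=20.0, max_delay=60.0):
    """Describe one query: which host sends it, what it asks, and how long to wait."""
    decoy = rng.choice(DECOY_TERMS)
    return {
        # Round-robin over the available hosts.
        "host": HOSTS[index % len(HOSTS)],
        # Assumption: the unrelated term is attached as an exclusion.
        "query": f"{packer} exe -{decoy}",
        "headers": {"User-Agent": rng.choice(USER_AGENTS)},
        # Random pause (in seconds) before the next query from this host.
        "delay": rng.uniform(min_delay, max_delay),
    }

plan = plan_query("UPX", 4, random.Random())
```

Passing the random generator in as `rng` keeps the function easy to test with a fixed seed.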
The results will have some inaccuracies, as it's possible that some of the terms appeared in news posts unrelated to the packers. Yet I think they look plausible. When the volume of hits becomes high enough, or stays constant over time, I take it to indicate the approximate release date of the packer in question, or at least the first public discussion of it, and I would tend to think the two are not too far apart. If someone can either corroborate or refute the data, I'll be glad to hear about it.
I also did some tests overlaying virus release times to try to spot correlations between big outbreaks and news posts about packers, but I couldn't see anything particularly significant.
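That "high enough or constant over time" rule of thumb can be written down as a simple heuristic: take the first month that starts a run of consecutive months at or above some hit count. This is only a sketch of the idea, not what I actually ran in Mathematica; the function name, the threshold and the run length are arbitrary assumptions, and the counts in the example are made up.

```python
def first_sustained_month(series, threshold=5, run=3):
    """Return the label of the first month that starts `run` consecutive
    months with at least `threshold` hits, or None if there is no such run.

    `series` is a chronological list of (label, hit_count) pairs with no
    missing months.
    """
    streak = 0
    for i, (label, count) in enumerate(series):
        # Extend the streak on a busy month, reset it on a quiet one.
        streak = streak + 1 if count >= threshold else 0
        if streak == run:
            return series[i - run + 1][0]
    return None

# Made-up monthly counts, purely for illustration.
example = [
    ("1996-01", 0), ("1996-02", 7), ("1996-03", 1),
    ("1996-04", 6), ("1996-05", 9), ("1996-06", 5), ("1996-07", 8),
]
estimate = first_sustained_month(example)
```

The isolated spike in the second month is skipped, which is the point: a single stray hit is more likely to be an unrelated post than a release.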