jsgrep is like grep, but instead of working on characters, it works on a JavaScript token stream. It’s on GitHub at https://github.com/sfrancisx/jsgrep.
I’m a front-end engineer on Yahoo! Mail. I wrote jsgrep for my own use in mid-2010, and I’ve been (lackadaisically) promoting jsgrep within the Yahoo! Mail team since late 2010. Honestly, I have not met with much success. “It works on tokens instead of characters” seems like a subtle distinction, and it doesn’t really convey how powerful jsgrep can be. I don’t like to blow my own horn, but I have to say that it is just about the coolest command line tool I’ve ever used. I probably use it an average of 20 times a day.
Yahoo! Mail is a big program, with a lot of people working on it. The portion that I work on has 9179 functions in about a quarter million lines of code (well, 117,064 lines have code on them. The rest is whitespace or comments.) I am very familiar with a small part of the code, slightly familiar with a slightly larger part of the code, and just about completely clueless about most of it. Unfortunately, I might have to work on a bug that occurs anywhere. Being able to search the code quickly and easily is extremely important.
Enter jsgrep
Part of what makes jsgrep so cool is that it’s so convenient to use. I have defined file sets for different areas of Mail’s code. My default file set includes only the code that I work in (including the parts that I’m clueless about). By far the most common thing I do with jsgrep is to find function definitions or function calls, so I’ve defined macros for both (along with 29 other macros that I use less often.) Compare jsgrep to grep when finding a function called onUpdate:
jsgrep
$ jsgrep F:onUpdate
grep
$ find ~/dev/yahoo/ymail/src -name '*.js' -exec grep -r -E '([^A-Za-z0-9]onUpdate[ ]*(=|:)[ ]*(new)?[ ]*function)|(function[ ]+onUpdate[^A-Za-z0-9])' {} ";"
It took me a half hour or so to figure out that grep command. It only takes me about 30 seconds to lose my train of thought, so the grep command is essentially useless to me. Simply grepping for onUpdate is a lot easier, but it also returns a lot of stuff, including comments and references to ‘actionUpdate’ and ‘onUpdatesReady’. And, of course, ‘onUpdate’ is not a common string – doing a simple grep for ‘set’ returns 16,570 matches.
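To make the token-versus-character distinction concrete, here is a minimal sketch. It is not jsgrep's implementation; the `tokenize` function and the sample source are made up for illustration, and the tokenizer is deliberately crude (it ignores regex literals, for instance). It only shows why an exact token match on onUpdate skips comments, strings, and longer names, while a character search does not.

```javascript
// Toy tokenizer, for illustration only (an assumption, not jsgrep's code).
// It drops comments, keeps each string literal as one token, and splits
// everything else into identifiers, numbers, and single-character punctuators.
function tokenize(source) {
  const pattern =
    /\/\/[^\n]*|\/\*[\s\S]*?\*\/|"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|[A-Za-z_$][\w$]*|\d+(?:\.\d+)?|\S/g;
  const tokens = [];
  for (const match of source.matchAll(pattern)) {
    const text = match[0];
    if (text.startsWith("//") || text.startsWith("/*")) continue; // skip comments
    tokens.push(text);
  }
  return tokens;
}

// Hypothetical sample source with the kinds of noise a plain grep picks up.
const sample = [
  "// onUpdate is called when the list changes",
  "var handlers = {",
  "  onUpdate: function () { actionUpdate(); },",
  "  onUpdatesReady: function () { log('onUpdate done'); }",
  "};",
].join("\n");

const tokens = tokenize(sample);

// Character search: also hits the comment, the string, and longer identifiers.
const charMatches = (sample.match(/onUpdate/g) || []).length;

// Token search: only the identifier that is exactly onUpdate.
const tokenMatches = tokens.filter((t) => t === "onUpdate").length;

console.log(charMatches, tokenMatches); // prints "5 1"
```

The character search counts the comment, actionUpdate, onUpdatesReady, and the string literal along with the one real identifier.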
Although finding functions is my most common use, I do often use it for more complex tasks. Here are some real life examples:
- Developers occasionally accidentally check in debugger or console.log() statements.
- Trailing commas in object initializers break IE 7.
- Our code coverage tool can’t handle for statements that don’t have braces.
- I’d like a quick & dirty way to find unused code.
- extends is a future reserved word in JavaScript, but most browsers allowed it to be used as an identifier until August of 2011, when some of my code mysteriously broke.
- We ship compressed code. It’s not uncommon to reproduce a bug in production, and to know exactly where it’s occurring, but to be unable to find the source. I recently had a problem isolated to x=c.getAttribute(d).
- I don’t know if this is a real life example, but I’m strangely interested in statistical trivia.
$ jsgrep '(console.log)|debugger'
At the moment, I have 10 console.log() statements and 6 debugger statements in my source. Some of these are local changes, or they’re in debug code, but a couple of them look like they need to be removed.
$ jsgrep ,}
This happens more often than you’d think. When you define an object’s functions inline, the last comma may be hundreds of lines away from the closing brace. Comment out the last function and you’ve created a very hard to see problem.
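A small sketch of how that problem arises and why a character search misses it. The sample object and the `stripComments` helper are hypothetical, and stripping comments with a regex is only a stand-in for real tokenizing (it would mangle comment-like text inside strings).

```javascript
// Hypothetical object literal whose last function has been commented out,
// leaving a trailing comma after render.
const source = [
  "var view = {",
  "  render: function () { return 1; },",
  "  // destroy: function () { return 2; }",
  "};",
].join("\n");

// Crude comment stripper, for illustration only. A real tokenizer would
// not be fooled by "//" or "/*" appearing inside a string literal.
function stripComments(code) {
  return code.replace(/\/\*[\s\S]*?\*\//g, "").replace(/\/\/[^\n]*/g, "");
}

const trailingComma = /,\s*}/;

// On the raw text the comment sits between the comma and the brace,
// so the pattern does not match.
const rawHit = trailingComma.test(source);

// With comments out of the way, the comma is directly followed by "}".
const strippedHit = trailingComma.test(stripComments(source));

console.log(rawHit, strippedHit); // prints "false true"
```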
$ jsgrep 'for (LPAREN) .* C:1 (!LBRACE)'
LPAREN matches the open paren token. The macro is defined as \(. If you type \( on the command line without quoting it, the shell thinks you’re escaping the parenthesis for it, and it removes the backslash. To get the shell to pass \( to jsgrep, you have to type \\\( (or quote \() on the command line. Using this macro will always work, and it just seems easier.
C:1 is a cool feature with an awful syntax. It matches the closing paren, bracket or brace for capture #1.
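The idea behind matching a closer to its opener can be sketched as a depth counter over a token array. This is an assumption about the general technique, not a description of how jsgrep implements C:1; `findMatchingClose` and `forsWithoutBraces` are hypothetical names, and the token arrays are written by hand.

```javascript
const CLOSERS = { "(": ")", "[": "]", "{": "}" };

// Returns the index of the token that closes the opener at openIndex,
// or -1 if the token stream ends first. Only tokens of the same bracket
// kind are counted, which is enough for well-formed code.
function findMatchingClose(tokens, openIndex) {
  const open = tokens[openIndex];
  const close = CLOSERS[open];
  if (!close) throw new Error("not an opening token: " + open);
  let depth = 0;
  for (let i = openIndex; i < tokens.length; i++) {
    if (tokens[i] === open) depth++;
    else if (tokens[i] === close) depth--;
    if (depth === 0) return i;
  }
  return -1;
}

// Indexes of `for` tokens whose closing paren is not followed by "{".
function forsWithoutBraces(tokens) {
  const hits = [];
  for (let i = 0; i + 1 < tokens.length; i++) {
    if (tokens[i] !== "for" || tokens[i + 1] !== "(") continue;
    const closeIndex = findMatchingClose(tokens, i + 1);
    if (closeIndex !== -1 && tokens[closeIndex + 1] !== "{") hits.push(i);
  }
  return hits;
}

// for (i = f(); ; ) { g(); }
const braced = ["for", "(", "i", "=", "f", "(", ")", ";", ";", ")", "{", "g", "(", ")", ";", "}"];
// for (i = f(); ; ) g();
const unbraced = ["for", "(", "i", "=", "f", "(", ")", ";", ";", ")", "g", "(", ")", ";"];

console.log(forsWithoutBraces(braced));   // prints []
console.log(forsWithoutBraces(unbraced)); // prints [ 0 ]
```

The nested f() call is the reason a plain "next closing paren" search is not enough: the first ) after for ( belongs to f, not to the for statement.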
$ jsgrep -m -l- -n- F:NAME | sort | uniq > all_funcs
$ jsgrep -m -l- -n- NAME LPAREN | sort | uniq > called_funcs
$ diff called_funcs all_funcs
This is quick and very dirty. The task got de-prioritized before I figured out if the results were too noisy to be useful.
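The comparison that the diff performs amounts to a set difference. A minimal sketch, with made-up name lists standing in for the contents of all_funcs and called_funcs (`neverCalled` is a hypothetical helper):

```javascript
// Names that appear in `defined` but never in `called`, de-duplicated and sorted.
function neverCalled(defined, called) {
  const calledSet = new Set(called);
  return [...new Set(defined)].filter((name) => !calledSet.has(name)).sort();
}

// Hypothetical stand-ins for the two files produced by the jsgrep commands.
const allFuncs = ["init", "render", "destroy", "render"];
const calledFuncs = ["render", "init", "push"];

const candidates = neverCalled(allFuncs, calledFuncs);
console.log(candidates); // prints [ 'destroy' ]
```

One likely source of noise: a function that is only passed as a callback or event handler is never followed directly by an open paren, so it would show up as a candidate even though it is used.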
$ jsgrep 'class|const|enum|export|extends|import|super'
class, const, etc. are also future reserved words.
$ jsgrep NAME=NAME.getAttribute LPAREN NAME
There were several matches, but it happened to be obvious which one I wanted. If it hadn’t been obvious, I would have added a few more tokens.
6295 of Mail’s 9179 functions are named, although 1122 of the named functions are anonymous functions assigned to a variable. The most commonly called function name is get (7.1% of function calls), followed by one (5.8%), set (3.3%), push (3.2%) and on (2.6%). We have 10 functions named get in Mail’s code, and there are another 30 in YUI. 33 functions take 5 or more parameters.
The most common token in our code is “.” (18.5% of all tokens) followed by “(” and “)” (tied (whew!) at 15% of tokens). The most common name token is _this (3.3% of tokens and the 10th most common token overall).
We have 17,149 bytes of code in 484 log statements, and another 3,412 bytes in 88 assertions.
We have 266 calls to setTimeout or Y.later. 38 of them have a timeout of 0.
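Token percentages like the ones above come down to a frequency count over the token stream. A minimal sketch, assuming the tokens are already available as an array of strings (`tokenShares` and the sample array are hypothetical, and this is not how the figures in this post were produced):

```javascript
// Counts each distinct token and reports its share of the whole stream,
// most frequent first.
function tokenShares(tokens) {
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([token, count]) => ({ token, count, percent: (100 * count) / tokens.length }))
    .sort((a, b) => b.count - a.count);
}

// Hand-written tokens for: a.b().c.
const shares = tokenShares(["a", ".", "b", "(", ")", ".", "c", "."]);
console.log(shares[0]); // prints { token: '.', count: 3, percent: 37.5 }
```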