Nothing ventured, nothing gained
08 Dec 2008 » How to use Method#to_ruby

In Ruby, there’s a handy dandy way to get a string representation of a method.

hoktauri$ cat method_to_ruby.rb
def bacon
  "chunky"
end

if $0 == __FILE__
  require 'rubygems'
  require 'ruby2ruby'
  require 'parse_tree_extensions'
  require 'pp'
  pp method(:bacon).to_ruby
end

hoktauri$ ruby method_to_ruby.rb
"def bacon\n \"chunky\"\nend"

Uses the ruby2ruby and ParseTree gems.

08 Dec 2008 » Understanding operator precedence using ParseTree

A small tribute to Guy Decoux, an early Ruby programmer who once walked the Ruby parse tree to answer a simple operator precedence question posed by Matz.

Read Matz’s question and Guy’s answer. Today, we can answer the same question with the following code:

hoktauri$ cat guy.rb
if $0 == __FILE__
    require 'rubygems'
    require 'parse_tree'
    require 'parse_tree_extensions'
    require 'yaml'
    pt = ParseTree.new
    y pt.parse_tree_for_string("a b c, d")
end

hoktauri$ ruby guy.rb
- - -
- - :fcall
  - :a
  - - :array
    - - :fcall
      - :b
      - - :array
        - - :vcall
          - :c
        - - :vcall
          - :d
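
The same grouping can be checked at runtime without a parser. This is only a sketch: the a and b methods below are hypothetical stand-ins for the bare identifiers in the parsed string (they are not part of ParseTree), and each one simply reports the arguments it receives. If Ruby reads a b c, d as a(b(c, d)), which is what the tree above shows, then b should receive both c and d, and a should receive only b's result.

```ruby
# Hypothetical stand-ins for the identifiers in "a b c, d"; each method
# reports its own name and the arguments it was called with.
def a(*args)
  [:a, args]
end

def b(*args)
  [:b, args]
end

def nested_call_shape
  c = 1
  d = 2
  a b c, d # no parentheses, exactly as in the parsed string
end

p nested_call_shape
```

The result nests b's call inside a's single argument, matching the :fcall inside :fcall structure in the YAML dump.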

Guy passed away earlier this year. His genius was admired and he will be missed by the entire Ruby community.

One more gotcha on operator precedence.

According to the Programming Ruby chapter on expressions, the difference between the && (double ampersand) and and operators is precedence: && binds more tightly than and.

>> false and true ? 'chunky' : 'bacon'
false
>> false && true ? 'chunky' : 'bacon'
"bacon"
>> (false and true) ? 'chunky' : 'bacon'
"bacon"

The or and || (double pipe) operators behave similarly.
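
To make the grouping explicit, here is a small sketch that wraps each expression in a method (the method names are mine, purely for illustration) and notes in a comment how Ruby groups it. The or and || pair is included to show the matching behavior.

```ruby
def ternary_with_and
  false and true ? 'chunky' : 'bacon' # false and (true ? 'chunky' : 'bacon')
end

def ternary_with_double_ampersand
  false && true ? 'chunky' : 'bacon'  # (false && true) ? 'chunky' : 'bacon'
end

def ternary_with_or
  true or false ? 'chunky' : 'bacon'  # true or (false ? 'chunky' : 'bacon')
end

def ternary_with_double_pipe
  true || false ? 'chunky' : 'bacon'  # (true || false) ? 'chunky' : 'bacon'
end

p ternary_with_and
p ternary_with_double_ampersand
p ternary_with_or
p ternary_with_double_pipe
```

With and and or, the ternary is only the right-hand operand, so the left-hand value wins. With && and ||, the whole boolean expression becomes the ternary's condition.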

These two code snippets also have very different abstract syntax trees, and comparing the trees makes the difference easier to see.

>> y pt.parse_tree_for_string("false and true ? 'chunky' : 'bacon'")
- - -
- - :and
  - - :false
  - - :if
    - - :true
    - - :str
      - chunky
    - - :str
      - bacon


>> y pt.parse_tree_for_string("false && true ? 'chunky' : 'bacon'")
- - -
- - :if
  - - :and
    - - :false
    - - :true
  - - :str
    - chunky
  - - :str
    - bacon

Uses the ruby2ruby and ParseTree gems.
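
As far as I know, ParseTree only supports Ruby 1.8, so on later versions a comparable view is available from Ripper, which ships with Ruby's standard library. The sketch below is a swapped-in technique, not what the post above used, and the root_node_type helper is a name I made up. It returns the type of the outermost node in the parse tree.

```ruby
require 'ripper'

# Returns the node type at the root of the first statement in +source+.
# Ripper.sexp returns [:program, [statement, ...]], and every statement
# is an array whose first element is its node type.
def root_node_type(source)
  _program, statements = Ripper.sexp(source)
  statements.first.first
end

p root_node_type("false and true ? 'chunky' : 'bacon'")
p root_node_type("false && true ? 'chunky' : 'bacon'")
```

The and version has a binary operator node at the root, with the ternary as its right operand. The && version has the ternary (:ifop in Ripper's vocabulary) at the root, which mirrors the two ParseTree dumps.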

13 Nov 2008 » Gisting, an early preview of MapReduce in Ruby

Earlier this year I gave a talk on ruby2ruby at my local Phoenix Users group. I followed up with a longer and more technical talk at RubyConf 2008. Not wanting to show up with a lack of code, I demonstrated the power of ruby2ruby by writing a couple of programs.

One of the programs I wrote is called Gisting, an open source Ruby implementation of Google’s MapReduce framework, which simplifies writing distributed, data-intensive applications.
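
For readers new to the model, the map, group, and reduce steps behind a frequency count can be sketched in plain, single-process Ruby. This is not Gisting's API: every method name below is hypothetical, and the sample lines assume tab-separated log records with the query in the second field.

```ruby
# Map: emit a [query, 1] pair for every log line.
def map_phase(lines)
  lines.map { |line| [line.strip.split("\t")[1], 1] }
end

# Shuffle: group the emitted values by key.
def shuffle(pairs)
  pairs.group_by(&:first).transform_values { |group| group.map(&:last) }
end

# Reduce: sum the values collected for each key.
def reduce_phase(grouped)
  grouped.transform_values(&:sum)
end

def frequency_count(lines)
  reduce_phase(shuffle(map_phase(lines)))
end

sample = [
  "2722\tmailbox\t2006-05-23 00:08:39",
  "217\t-\t2006-05-23 15:41:48",
  "2722\tmailbox\t2006-05-23 00:08:42"
]

p frequency_count(sample)
```

A MapReduce framework runs the same three steps, but spreads the map and reduce work across many tasks and machines.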

After the talk, I got so much positive feedback that I decided to build a releasable version of the software. The software is almost ready for a public release, but before that happens, I’d like to announce an early preview.

There isn’t much documentation available just yet, but I wanted to show just how easy it is to write MapReduce programs with Gisting. Here’s a snippet that performs a frequency count for the AOL search logs:

inputs = args
spec = Gisting::Spec.new
inputs.each do |file_input|
  input = spec.add_input
  input.file_pattern = file_input
  input.map do |map_input|
    # 2722  mailbox  2006-05-23 00:08:39
    # 217  -  2006-05-23 15:41:48
    # 1326  www.crazyradiodeals.com  2006-05-23 18:00:30
    # 2722  mailbox  2006-05-23 00:08:39
    # 2722  mailbox  2006-05-23 00:08:42
    # 2722  jc whitney  2006-05-23 00:25:47  1  http://www.jcwhitney.com
    words = map_input.strip.split("\t")
    Emit(words[1], "1")
  end
end
output = spec.output
output.filebase = "/Volumes/gisting/datasets/output"
output.num_tasks = 2
output.reduce do |reduce_input|
  count = 0
  reduce_input.each do |value|
    count += value.to_i
  end
  Emit(count)
end

result = MapReduce(spec)
pp result

Keep in mind that this is an early preview, so I’m well aware that it needs a lot of TLC before I’ll be happy making a 1.0 release, such as:

  • A test suite :(
  • A screencast of Gisting basics
  • A homepage/website with examples and documentation
  • Running Gisting in the clouds (Amazon EC2).

That said, I’m planning to release a gem in a few weeks. In the meantime, I hope you enjoy this early preview of Gisting.

05 Sep 2008 » Chrome's Process Model Explained

Recently, Google released the Chrome web browser, which they describe as the next step in web browsers for the current gamut of JavaScript-intensive web applications. One new feature I’m particularly excited about is process affinity. The online comic describes each tab as a separate running process.

Why is this important?

The short answer is robustness. A web application running in your browser is a lot like an application running on your operating system, with one important distinction: modern operating systems [1] run applications in their own separate process space, while modern browsers [2] run web applications in the same process space.

By running applications in separate processes, the OS can terminate a malicious (or poorly written) application without affecting the rest of the OS. The browser, on the other hand, can’t do this. Consequently, a single rogue application can suck up mountains of memory and eventually crash your entire browser session, along with every other web application you were using at the time.

Chrome differs by running each web application in a separate process space. By doing this, Chrome, or a user, can terminate a single web application without affecting the other tabs you have open.

Process affinity in Chrome

Chrome’s process model is extremely sophisticated. The web comic only mentions the default behavior, but you can configure Chrome to manage processes differently: one process per web site, or one process per group of connected tabs, or one process for everything.

Process-per-site-instance

By default, there are two main Chrome processes, the Browser and the Renderer. The single Browser process is responsible for transporting messages to and from the Renderer, which in turn is responsible for rendering webpages to the user.

[Figure: Browser process to Renderer processes. 1 Browser process communicates with N Renderer processes.]

Each Renderer process has two threads: a Render thread, which renders web pages, and an IPC thread, which transports data in a thread-safe, non-blocking manner between the Render thread and an IPC counterpart sitting in the Browser process. The Renderer process manages 1 IPC thread and 1 Render thread.
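
The two-thread split can be illustrated with Ruby threads and queues. This is a loose, in-process analogy under my own assumptions, not Chrome's code: the render_pages method, the queue names, and the :shutdown message are all hypothetical, and real IPC crosses a process boundary while these queues do not.

```ruby
# One "IPC" thread receives messages and hands them to one "Render"
# thread, so the renderer never reads from the outside world directly.
def render_pages(pages)
  inbox    = Queue.new # messages arriving from the (imaginary) Browser side
  work     = Queue.new # hand-off between the IPC thread and the Render thread
  rendered = Queue.new # results produced by the Render thread

  ipc = Thread.new do
    while (message = inbox.pop) != :shutdown
      work << message
    end
    work << :shutdown # tell the Render thread to stop as well
  end

  render = Thread.new do
    while (page = work.pop) != :shutdown
      rendered << "rendered #{page}"
    end
  end

  pages.each { |page| inbox << page }
  inbox << :shutdown
  [ipc, render].each(&:join)

  Array.new(rendered.size) { rendered.pop }
end

p render_pages(["mail.google.com", "hotmail.com"])
```

Because each queue is first-in, first-out and each side has a single thread, pages come out in the order they were sent.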

Completely separate visits to the same site are managed by different processes, so if you had two tabs open to mail.google.com, one of them could crash without affecting the other. Chrome treats separate browsing contexts as separate processes.

[Figure: Process per site instance]

If you’re on mail.google.com, and you navigate to hotmail.com, the tab’s underlying process may switch. In this case, Chrome switches your browsing context because you navigated to another site.

If a web page pops up another webpage (via JavaScript), then the sites are considered connected, and managed by the same process. Chrome uses a single Renderer process to handle a browsing context.

This is Chrome’s default behavior and is called process-per-site-instance. It’s intuitive in that your tab count is (more or less) your process count.

Process-per-site

Since multiple tabs can be assigned to a single Renderer process, wouldn’t it be neat if the Renderer process could manage a group of sites?

That’s what process-per-site does. Chrome defines a “site” much like an origin in the Same Origin Policy, except that subdomains of the same domain count as one site.

For example, in process-per-site mode, mail.google.com, docs.google.com and reader.google.com are all managed by a single Renderer process. If one of those web applications crashes, then the responsible Renderer process will crash, taking down the entire collection of tabs.

[Figure: Process per site]

Unlike the previous process model, a tab does not imply a separate Renderer process.

Process-per-tab

The third and most intuitive process model is called process-per-tab.

In this model, tabs have their own process, but unlike process-per-site-instance and process-per-site, none of the underlying process switching logic is applied.

Each tab has its own process for the life of the tab, so a tab will never change process even if a user consecutively visits hotmail.com, gmail.com, and ymail.com.

[Figure: Process per tab. One process per tab, forever.]

Single-process

Finally, the fourth and simplest process model is the single-process behavior. You can run Chrome in a mode that combines the Browser and Renderer process into a single process. This makes Chrome behave a lot like the browsers we have today [2].

Choose your Process model

Anyway, if you made it this far down, the takeaway is that the various process models define different ways of assigning tabs to processes, so your user experience will vary depending on your OS, your browsing behavior, and the websites you frequent.
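
One way to compare the four models is to treat each as a rule for deciding which tabs share a process. The sketch below is a simplified model of my own, not Chrome's implementation: the Tab struct, the group number (standing in for a set of connected tabs), and the naive site_of helper are all assumptions made for illustration.

```ruby
# Hypothetical model of an open tab: the host it shows and the group of
# connected tabs (an opener and its popups) it belongs to.
Tab = Struct.new(:host, :group)

# Naive "site": the last two labels of the host name. This ignores
# suffixes such as co.uk and is only good enough for the example.
def site_of(host)
  host.split('.').last(2).join('.')
end

# Tabs that map to the same key share a process under the given model.
def process_key(model, tab)
  case model
  when :process_per_site_instance then [tab.group, site_of(tab.host)]
  when :process_per_site          then site_of(tab.host)
  when :process_per_tab           then tab.group
  when :single_process            then :everything
  else raise ArgumentError, "unknown process model: #{model}"
  end
end

def process_count(model, tabs)
  tabs.map { |tab| process_key(model, tab) }.uniq.size
end

tabs = [
  Tab.new('mail.google.com', 1),
  Tab.new('mail.google.com', 2), # a completely separate visit
  Tab.new('docs.google.com', 3),
  Tab.new('hotmail.com', 4),
  Tab.new('mail.google.com', 1), # popup opened by the first tab
  Tab.new('hotmail.com', 2)      # same group as the second tab, navigated to another site
]

%i[process_per_site_instance process_per_site process_per_tab single_process].each do |model|
  puts "#{model}: #{process_count(model, tabs)}"
end
```

For these six tabs, the model yields five processes for process-per-site-instance, two for process-per-site, four for process-per-tab, and one for single-process.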

To use a specific process model, you can launch Chrome with one of the following arguments.

--process-per-site
--process-per-tab
--single-process
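
If you switch models often, a tiny helper can build the launch command for you. This is only a sketch: the chrome_command name and the binary path are placeholders of mine, and the only flags it knows are the three listed above. Passing no model leaves the default process-per-site-instance behavior, which needs no flag.

```ruby
# The three switches listed above, keyed by a symbol for each model.
PROCESS_MODEL_FLAGS = {
  process_per_site: '--process-per-site',
  process_per_tab:  '--process-per-tab',
  single_process:   '--single-process'
}.freeze

# Builds an argument list suitable for Kernel#system or Kernel#spawn.
# +binary+ is whatever path Chrome has on your machine.
def chrome_command(binary, model = nil)
  return [binary] if model.nil? # default model, no flag needed
  [binary, PROCESS_MODEL_FLAGS.fetch(model)]
end

p chrome_command('/path/to/chrome', :process_per_tab)
# To actually launch: system(*chrome_command('/path/to/chrome', :process_per_tab))
```

Using fetch means an unknown model raises a KeyError instead of silently launching Chrome with the default behavior.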

If you’re interested in reading more about memory isolation and the challenges in building a browser like Chrome, check out Charlie Reis’ paper on Using Processes to Improve the Reliability of Browser-based Applications. Chrome’s process model is derived from this paper.

Thanks to Ben Smith, and the developers in #chromium (irc.freenode.org) for reading drafts of this article.

Updated: Charlie dives into the reasons for a multi-process architecture browser.


  1. Vista, Linux, Unix, OS X, pretty much anything after Windows 2000.

  2. I’m specifically referring to Firefox 3 and Safari 3, which run in a single process. I’m not familiar with Opera, Konqueror, or Explorer’s process model, so there may already be browsers which do a great job at isolating processes or managing threads (Like Opera, I love Opera).

18 Jul 2008 » OpenRain 1.0

In less than a year, OpenRain went from this:

Doing Ruby
Ruby, ruby, ruby

Playing guitar
I miss the guitar

Exercising
And the Elliptical.. in our conference room

to this:

Seating area

Hacking area

Eating area

To celebrate, OpenRain is throwing our first open house tonight. In addition to friends and family, this is an open invitation to folks in Phoenix, AZ interested in the design, development, and business of web software.

In other words, you’re invited.

Why a celebration?

Officially, it’s because OpenRain recently moved into a new office in sunny Mesa, AZ, and to commemorate this upgrade, we’re throwing a “1.0 Release Party.”

Unofficially, it’s to celebrate just how far OpenRain has come from two guys programming in the spare bedroom. Personally, I’m delighted with just how much growth we’ve experienced in the last nine months and even though the best is yet to come, it’s important that we take a moment to celebrate our recent successes.

Congratulations to the entire OpenRain 1.0 Team.