Don G. Park

the sublime abiding

In July 2011, I started an irc bot (donpdonp/neuronirc) because irc provides a simple yet powerful human-to-human and human-to-computer interface. It was built on the principle of loosely coupled, coordinated processes.

The foundation of the coordination is the redis pub/sub mechanism. Messages encoded in JSON are sent over pub/sub channels to interested listeners. There is redis support for nearly every language, so an extension to the bot can be written in nearly any language even though the core of the bot is in ruby.

This setup was working well but didn't go quite far enough. While new functionality, or 'modules', was now possible in any language, creating and launching a new module required significant administrator intervention. The users or participants in the channel could surely contribute their own useful code to the irc bot, and I wanted a way to capture those contributions easily, without administrator intervention.

In May of 2012, a change was committed - userspace v8. Thanks to the fantastic ruby module called therubyracer, it was simple to encapsulate a javascript interpreter inside the bot. It has enough sandboxing to make it safe to execute most any user script.

Users in an irc chatroom could now give code to the bot to extend its functionality. I started to use it all the time because it was simple and fast to add a new command to the bot's inventory. Some script management commands were added to make it easier to list your scripts, delete one of them, and create a script from the contents of a url (usually a github gist). Scripts ranged from simple to complex.

Now that there are 20-something scripts inside the neuronbot (called zrobo on #pdxtech/freenode), I forget about scripts that I've written in the past. Many of them are periodically executing. That starts to feel like sci-fi. I can read and understand any single script, but the entire system is starting to grow beyond my ability to comprehend.

Scripts listen for messages and messages have types. Most often the type is a new line of text from an irc chatroom. It was a huge leap forward when a new type of message started to be emitted once a minute. That's a message indicating a change in time, which enables scripts to run periodically. As of this month, scripts can also emit their own messages. This lets complexity be distributed across multiple scripts. One script can be dedicated to monitoring a feed of some sort, such as the weather or a bitcoin marketplace. When conditions are right, a signal is emitted and another script can take action based on that.
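
The message flow can be sketched in a few lines of javascript. This is a minimal sketch and not the neuronirc code: node's EventEmitter stands in for the redis pub/sub channel, and the message types ('tick', 'signal'), the field names, and the price threshold are all hypothetical.

```javascript
// Sketch only: EventEmitter stands in for a redis pub/sub channel.
// Message types, field names and the threshold are hypothetical.
const EventEmitter = require('events')

const bus = new EventEmitter()

// Every message travels as JSON and carries a "type" field.
function emit(message) {
  bus.emit('message', JSON.stringify(message))
}

// Scripts subscribe to the messages of one type.
function listen(type, handler) {
  bus.on('message', function (raw) {
    const message = JSON.parse(raw)
    if (message.type === type) handler(message)
  })
}

// Script 1: on every clock tick, check a feed and emit a signal
// when the condition is met.
let latestPrice = 0
listen('tick', function () {
  if (latestPrice > 100) {
    emit({ type: 'signal', name: 'price-high', value: latestPrice })
  }
})

// Script 2: knows nothing about the feed, it only reacts to the signal.
const actions = []
listen('signal', function (message) {
  actions.push(message.name + ' at ' + message.value)
})

// Simulate two clock ticks with the price changing in between.
emit({ type: 'tick' })
latestPrice = 120
emit({ type: 'tick' })
console.log(actions)
```

The first tick does nothing because the condition is not met. The second tick makes script 1 emit a signal, and script 2 records one action in response.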

Another area of exploration is to standardize the message format amongst different irc bots, and use an irc channel as a bus for messages between bots.

Published on 17 Apr 2013 at 04:49PM .


Modern websites are web services with an html front end. Web services need to be easy to manage and easy to scale. The variety of ways to architect these features is a never-ending playground of possibilities. Below is a description of the features I want and my current thinking on how to achieve them.

Growth

Easily scale up or down the group of virtual machines that provides the service. Any one server should be able to be removed without affecting the service. Add more compute nodes programmatically, even across different VPS providers. tool: libcloud

All API

There is only the API. The website is an html client that makes ajax calls to it. Ideally the website is static html/css, with javascript logic to make it perform/interact with the user. tools: jquery, dustjs

Resilient API

Rather than being handled directly, each HTTP request goes into a queue. Workers on different VMs pull jobs off the queue. The VMs are united by access to a common distributed datastore. tools: redis, mozilla circus
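
Here is a minimal sketch of the queue-and-workers idea. A plain in-process array stands in for the shared queue (in a real deployment it would live in a common store such as a redis list), and every name in it is hypothetical.

```javascript
// Sketch only: a plain array stands in for a shared queue such as a
// redis list. All names here are hypothetical.
const queue = []
const results = {}

// The front end does no work itself: it records the request as a job
// and hands back an id that the client can ask about later.
let nextId = 1
function acceptRequest(path) {
  const job = { id: nextId++, path: path }
  queue.push(job)
  return job.id
}

// A worker, which could run on any VM, pulls one job off the queue
// and stores the result where the front end can find it.
function workOnce(workerName) {
  const job = queue.shift()
  if (!job) return false
  results[job.id] = workerName + ' handled ' + job.path
  return true
}

const first = acceptRequest('/users')
const second = acceptRequest('/orders')
workOnce('worker-a')
workOnce('worker-b')
console.log(results[first], '|', results[second])
```

Because a worker only ever talks to the queue and the result store, any single worker can disappear without the front end noticing, which is the resilience this section is after.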

Published on 18 Mar 2013 at 10:04PM .


The bitcoin blockchain experienced a fork due to a subtle bug in the pre-0.8 bitcoin app. The 0.8 version was being used by a majority of the miners. After it had been suggested to increase the maximum blocksize, a large block (#225430) was generated by miners on 0.8. This large block was not processable by pre-0.8 apps. There was a significant percentage of miners on pre-0.8 and that brings us to the first, and core problem - how pre-0.8 handled the error.

Fix 1) Block 225430 was a valid block. It was the error handling of the bitcoin client that caused the problem. The hash was valid for its contents and for its place in the blockchain. The client should have been able to verify that before iterating over the transactions. It was the number of transactions that caused bdb to fail. The client should be changed to compartmentalize an exception that happens after the validity of the block is determined. When a valid block causes an error in processing after validation, the client should treat that as an irrecoverable internal error. Had it done so, the pre-0.8 miners and clients would have shut down and the chain would not have forked.

Fix 2) It is my understanding that all the forks exist in each client's blockchain. If so, put up a large warning that a fork exists and things may not be what they seem. The greatest let-down of this blockchain fork event is that the great promise to merchants was invalidated. Since I've been using bitcoin, the mantra has been that after 3, and at most 6, confirmations, it was statistically impossible to get a false positive on a transaction. That's true for a single chain. The large, now infamous, payment made on the 0.8 fork, which was simply undone when the pre-0.8 chain became the longest, showed that iron-clad agreement to be false. If the app could warn the user of the existence of a fork, then they'd have a chance to investigate further.

Published on 15 Mar 2013 at 03:39PM .


The future makerspace/coworking place has a name: Base16. The word 'base' is a good description of the purpose of the place. Base 16 is a nice pun/computer reference to hexadecimal.

It's primarily a coworking place, oriented toward software development, with space for the other currently exciting parts of the tech world: microcontrollers, 3d printers, quantified self projects, copter projects. It'll have a couple of large dashboard displays. There will be jungle-gym features throughout the space. Movement such as pullups and climbing will be integrated into daily activity.

Published on 10 Mar 2013 at 08:13PM .


I'm working on a rails app that stores UTF8 strings. It turns out MySQL's utf8 character set only supports characters of up to 3 bytes, while UTF8 itself allows 4-byte characters. The various encodings for the client connection and database can all be set correctly and an insert can still fail because a 4-byte UTF8 character was sent.

This is the error you'll see:

Incorrect string value: '\xF0\x9F\x91\x88' 
ActiveRecord::StatementInvalid: Mysql2::Error: Incorrect string value: '\xF0\x9F\x91\x88'
for column 'text' at row 1: 
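
The four bytes in that error message are the UTF-8 encoding of a single character, U+1F448. The following javascript sketch is not part of the rails app, and needsUtf8mb4 is a hypothetical helper name. It shows where those bytes come from and how to spot strings that will trip the 3-byte limit.

```javascript
// Returns true when the string contains a character outside the Basic
// Multilingual Plane, i.e. one that needs 4 bytes in UTF-8.
function needsUtf8mb4(text) {
  for (const ch of text) {            // for...of walks whole code points
    if (ch.codePointAt(0) > 0xFFFF) return true
  }
  return false
}

const pointing = '\u{1F448}'          // the character from the error message
const bytes = Buffer.from(pointing, 'utf8')

console.log(bytes.toString('hex'))     // f09f9188
console.log(needsUtf8mb4(pointing))    // true
console.log(needsUtf8mb4('caf\u00e9')) // false
```

Anything at or below U+FFFF fits in 3 bytes, which is why ordinary accented text works fine with the older utf8 character set and emoji do not.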

MySQL 5.5.3 added a new character set, utf8mb4, to support 4-byte characters. The utf8mb3 alias for utf8 was also created to more accurately represent the older encoding. This creates a problem for rails (3.2.12 as of this writing).

The first compatibility problem is the mysql2 driver itself. The current release is 0.3.11, committed on 2011-12-06. Driver support for utf8mb4 was committed on 2011-12-20.

So to even begin using utf8mb4, use the git head version of mysql2.

gem 'mysql2', :git => "https://github.com/brianmario/mysql2.git"

Add this to config/database.yml:

development:
  adapter: mysql2
  encoding: utf8mb4
  collation: utf8mb4_unicode_ci

If you create a new database, you'll run into this error:

Mysql::Error: Specified key was too long; max key length is 767 bytes:
 CREATE UNIQUE INDEX `unique_schema_migrations` ON `schema_migrations` (`version`)
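
This comes from a limitation that utf8mb4 exposes. An index key can be at most 767 bytes, and at up to 4 bytes per character that leaves room for at most 191 characters (767 / 4, rounded down), while schema_migrations.version is a varchar(255). (see http://dev.mysql.com/doc/refman/5.6/en/charset-unicode-upgrading.html)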

If you absolutely have to have 4-byte utf8 chars in a text column that was set up with utf8, you can add utf8mb4 support for a single column with the following sql:

alter table notes modify text varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

That was enough to get me going again. For now I'd avoid mysql on rails, at least until the index creation code becomes aware of the mysql index limitations with utf8mb4.

Published on 16 Feb 2013 at 04:55PM .


I use node.js and bouncy as the web front end to my sites. While node.js is known for its performance, I value the extreme flexibility I get by using javascript. I want to explain how it works and show why you might enjoy using it too.

In addition to the usual hostname based forwarding, I wanted two features and found out I could not get them from the existing popular solutions: nginx, varnish, or haproxy.

  • websocket forwarding
  • SNI support

These things work in nodejs/bouncy, and then it got more amazing as it was trivial to add etag support and in-memory file caching. When the rails parameters exploit came out, it was easy to add specific protection for that, which would have been difficult with other setups.

The following block is bouncy's hello world. Listen on port 80 and forward all requests to a web app on localhost port 2500.

/* node.js/bouncy hello world */
var bouncy = require('bouncy')

bouncy(function (req, bounce) {
  bounce(2500)
}).listen(80)

Feature - hostname based forwarding

First things first, forward a variety of hostnames to different internal ports - a typical vhost setup. I'm taking a shortcut with these code examples - assume they're inside the bouncy(function{}) block.

  var host = req.headers.host;
  // the dot is escaped so it only matches a literal "."
  if (host.match(/site-a\.com$/)) {
    bounce(2500)
  } else if (host.match(/site-b\.com$/)) {
    bounce(2600)
  }
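
Once there are more than a handful of sites, the chain of if statements can be replaced with a lookup table. This is a minimal sketch of that idea, not code from my actual setup: the vhosts table and the portFor name are hypothetical.

```javascript
// Hypothetical routing table: domain -> internal port.
const vhosts = {
  'site-a.com': 2500,
  'site-b.com': 2600
}

// Returns the port for a Host header, or null when nothing matches.
function portFor(hostHeader) {
  if (!hostHeader) return null
  // drop an optional ":port" suffix and normalise case
  const host = hostHeader.split(':')[0].toLowerCase()
  for (const name of Object.keys(vhosts)) {
    // match the bare domain or any subdomain of it
    if (host === name || host.endsWith('.' + name)) return vhosts[name]
  }
  return null
}

console.log(portFor('www.site-a.com'))   // 2500
console.log(portFor('site-b.com:80'))    // 2600
console.log(portFor('notsite-a.com'))    // null
```

Inside the bouncy block this would be used as var port = portFor(req.headers.host), followed by bounce(port) when port is not null. Unlike a regex anchored only at the end, it does not send notsite-a.com to the site-a.com backend.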

Feature - websocket forwarding

Surprise - there's no code for this feature because a websocket forward works just like an HTTP forward. Bouncy will connect the incoming connection directly to the websocket port and not get in the way of the stream at all. Perfect for websocket connections to websocket servers like socket.io or npm's websock library.

I was using nginx for a long time, but it was the lack of websocket support that got me to look for other options.

Feature - SNI support

This was a big deal to get support for, and neither haproxy nor varnish supports it (Jan 2013). The current development version of haproxy has support for SNI, but I could not get it to work after many attempts with different arrangements of the haproxy config file.

The approach I use here, and there are certainly others, is to have the same .js file that sets up the bouncy on port 80 also set up a second bouncy on port 443. The initial ssl config object gets bouncy into SSL mode. The initial key and cert are bogus because the real key and cert are selected by the incoming hostname. This bouncy on 443 will unwrap the SSL and forward the request to the bouncy on port 80, which has all the vhost setup, etc.

var fs = require('fs')
var crypto = require('crypto')

// placeholder key/cert: only needed to put bouncy into SSL mode,
// the real ones are chosen per hostname in sni_select
var ssl = {
    key : fs.readFileSync('/etc/ssl/private/ssl-cert-snakeoil.key'),
    cert : fs.readFileSync('/etc/ssl/private/ssl-cert-snakeoil.crt'),
    SNICallback: sni_select
};

// errlog is assumed to be an error handler defined elsewhere in the script
bouncy(ssl, function (req, bounce) {
  bounce(80)
}).listen(443).on('error', errlog);

function sni_select(hostname) {
  // optionally massage hostname to strip www here
  var ssl_path = '/etc/ssl/private/'+hostname
  var creds = {
    key : fs.readFileSync(ssl_path+'/private.key'),
    cert: fs.readFileSync(ssl_path+'/server.crt')
  }
  // include the issuer chain when one is present
  if(fs.existsSync(ssl_path+'/issuer.pem')) {
    creds.ca = fs.readFileSync(ssl_path+'/issuer.pem')
  }
  return crypto.createCredentials(creds).context
}

Feature - serve static files

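Here's where things get interesting. There is no explicit support in bouncy to serve static files, but because it's javascript running in node, it's trivial to load files off disk and serve them to the client.

var fs = require('fs')
var pathlib = require('path')

  if (host.match(/staticfiles\.com$/)) {
    // drop the query string and keep "../" from escaping the public root
    var root = "/www/staticfiles.com/public"
    var path = pathlib.join(root, pathlib.normalize("/" + req.url.split("?")[0]))
    if(fs.existsSync(path) && fs.statSync(path).isFile()) {
      var response = bounce.respond()
      fs.readFile(path, function(err,body) {
        if (err) {
          // the file vanished or is unreadable
          response.writeHead(500)
          return response.end()
        }
        response.end(body);
      });
    }
  }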

Feature - etag support with ram cache

Wait a second! I don't want to hit the disk on every request for a static file. Okay, let's whip up an in-ram cache of static files, and add etag support while we're at it.

var fs = require('fs')
var crypto = require('crypto')
var mime = require('mime');
var pathlib = require('path')

var etag_cache = {}

  if (host.match(/staticfiles\.com$/)) {
    // drop the query string and keep "../" from escaping the public root
    var root = "/www/staticfiles.com/public"
    var path = pathlib.join(root, pathlib.normalize("/" + req.url.split("?")[0]))
    if(fs.existsSync(path) && fs.statSync(path).isFile()) {
      var response = bounce.respond()
      if(typeof(req.headers["if-none-match"]) == "undefined"  || 
         req.headers["if-none-match"] != etag_cache[req.url]) {
         /* etag miss! */
         fs.readFile(path, function(err,body) {
          var shasum = crypto.createHash('sha1');
          shasum.update(body)
          var etag = shasum.digest('base64')
          response.writeHead(200, {
            'Content-Length': body.length,
            'Etag': etag,
            'Content-Type': mime.lookup(path) });
          response.end(body);
          etag_cache[req.url] = etag
        });
      } else {
        /* etag hit! */
        response.writeHead(304, {
          'Etag': etag_cache[req.url] });
        response.end();
      }
    }
  }  

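Using a javascript hash as the etag cache is simple and fast. Note that only the etags are held in ram: a hit answers with a 304 without reading the file, while a miss still reads it from disk. It should be easy to push the urls and etags to another store like memcache or redis if you prefer.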
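
The etag cache above keeps only etags in memory, so a request without a matching If-None-Match header still goes to disk. Here is a minimal sketch of also keeping the file bodies in ram. It is an illustration under my own assumptions rather than code from the running setup: file_cache and cachedFile are hypothetical names, and the demo at the bottom uses a throwaway temp file.

```javascript
const fs = require('fs')
const os = require('os')
const crypto = require('crypto')
const pathlib = require('path')

// path -> { body, etag }; filled on first request, served from ram after
const file_cache = {}

function cachedFile(path) {
  if (!file_cache[path]) {
    // first request for this path: read it once and remember it
    const body = fs.readFileSync(path)
    const etag = crypto.createHash('sha1').update(body).digest('base64')
    file_cache[path] = { body: body, etag: etag }
  }
  return file_cache[path]
}

// Demonstration with a throwaway file
const demoPath = pathlib.join(os.tmpdir(), 'etag-demo-' + process.pid + '.txt')
fs.writeFileSync(demoPath, 'hello')
const first = cachedFile(demoPath)
fs.unlinkSync(demoPath)               // the file is gone from disk...
const second = cachedFile(demoPath)   // ...but is still served from ram
console.log(first.etag === second.etag, second.body.toString())
```

The trade-off is that this cache never expires anything, so a changed file is not picked up until the script restarts, and the first read of each file blocks the event loop.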

Feature - SSL only sites

Wait, I only want to serve https to visitors of ssl-only.com! Isn't it a problem to forward to the port 80 bouncy? It's no problem! Have the bouncy on port 80 check the x-forwarded-proto header! As a bonus, you give regular http users the ol' 301 response!

  if (host.match(/ssl-only\.com$/)) {
    if(req.headers['x-forwarded-proto'] == 'https') {
      bounce(2500)
    } else {
      var response = bounce.respond()
      var url = "https://ssl-only.com"
      response.writeHead(301, {
        'Location': url });
      response.end();
    }
  }

Feature - Rails parameter exploit protection

I love ruby on rails and have a number of rails sites on my server. When the mother of all rails exploits happened, I updated my rails apps, but I also added a level of protection at the web front end since any XML post request was suspicious. Just for fun, let's log what commands were attempted with this exploit.

    // content-type may carry a charset suffix, so compare only the media type
    var content_type = (req.headers['content-type'] || '').split(';')[0].trim()
    if(content_type == "text/xml" || content_type == "application/xml") {
      // stop Rails remote code execution: log the attempt, never forward it
      // (append, so earlier attempts stay in the log)
      fs.appendFileSync('rails_rce.log', host+req.url+"\n");
      req.on("data",function(data){
         fs.appendFileSync('rails_rce.log',data.toString('utf8'))
      })
      return
    }

This was a huge win for easily modifying the behaviour of the web front end in ways not likely to be supported through the configuration files of nginx, haproxy, and varnish.

Wishlist

I've run into only one real problem with this setup. Malformed http requests cause an exception to be thrown in the parsley http library, which crashes the bouncy script. I need to get a better grip on error handling. Right now I have the script restart itself on crash, but that's obviously non-optimal. Once this is fixed, I expect it to be as reliable as any other web front end. It might be as simple as catching the exception in the bouncy script, but I haven't looked into it yet.

30-Jan-2013 edit: two days after writing this post on bouncy 1.3, I learned of bouncy 2.0, which changes the API. After an initial attempt to upgrade the script to 2.0 caused mysterious failures, I went back to 1.3.

As far as features go, what I would like is a way to change the configuration without dropping the existing clients. Currently changing the bouncy script means restarting node.js, which kills the existing connections. Mongrel2, I have read, does a good job of this.

Another missing feature is the ability to take action based on the response of the webapp. If an attempt to forward to port 2500 results in an http 500 response, for example, the browser will receive that response, when I would like to have the option to catch the error response in bouncy and redirect the browser to a different page/site.

Published on 28 Jan 2013 at 05:12PM .


I made some notes on paper and wanted to get them online. They're next steps for the Ghost Town Kingdom.

Cointhink

User selectable exchanges for the standard opportunity algorithm. User created javascript for highly custom reporting.

EveryoneDelivers

Mostly needs advertising. One unique angle is trying for non-monetary incentives.

EverythingFunded

Create an API. Make it easy for white label crowdfunding sites to use.

Financial Calendar

Lots of WePay integration potential. Moving money between wepay sub-accounts for savings/goal reaching.

Geomena

Move to rethinkdb mostly as an exercise to use rethink and move off couchdb.

IceCondor

Also move to rethinkdb.

Published on 08 Jan 2013 at 07:42PM .


Displaying the time in a web page often runs into the problem of which timezone to display. The server's timezone is the easiest, yet likely to be incorrect. A logged-in user might have a timezone specified already. The best choice for most circumstances is to use the browser's timezone. It's the best way to get a local time for the user looking at the page. Since only the browser knows its local timezone, some client-side javascript is necessary. The server-generated html provides a UTC time and the browser converts it to local time for display.

The system I use is below. Use your server side templating language to fill in the datetime field with an iso8601 date string.

Strategy #64 
<time data-format="yyyy-MMM-dd HH:mm" 
        datetime="2012-12-06T18:57:02Z"></time>

Loop through the time tags in jQuery. I use the xdate javascript time library because of its flexibility.

$('time').each(function(){
  var datetime = new XDate($(this).attr('datetime'))
  var formatted = datetime.toString($(this).attr('data-format'))+" "
  $(this).html(formatted)
})

Which generates the string "Strategy #64 2012-Dec-06 10:57" in html for my west-coast located web browser. Every time tag gets the same treatment. Done!
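
If pulling in xdate is not an option, the same conversion can be sketched with the built-in Date object. This is a minimal sketch that hard-codes the one format used above; formatLocal, pad and MONTHS are hypothetical names of my own.

```javascript
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

// left-pad a number to two digits
function pad(n) {
  return (n < 10 ? '0' : '') + n
}

// Formats a Date as "yyyy-MMM-dd HH:mm" in the local timezone of
// whatever machine runs this code (the browser, in the scenario above).
function formatLocal(date) {
  return date.getFullYear() + '-' + MONTHS[date.getMonth()] + '-' +
         pad(date.getDate()) + ' ' +
         pad(date.getHours()) + ':' + pad(date.getMinutes())
}

// The iso8601 string from the datetime attribute is parsed as UTC;
// the getters used in formatLocal then report local time.
const utc = new Date('2012-12-06T18:57:02Z')
console.log(formatLocal(utc))
```

The printed value depends on the timezone of the machine running it, which is exactly the point: the server only ever sends UTC.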

Published on 10 Dec 2012 at 07:19PM .


I believe it'd be useful to have a long-term log of heartrate. One example of a device that records this is this wearable pulse oximeter. It's $500 and records 80 hours of data.

A simpler version (no oximeter, just pulse) could, I believe, be done for very little cost. Using the remarkable teensy microcontroller board, a light, and a light sensor, I aim to make a wearable long-term heartrate monitor for myself and the Portland Quantified Self crowd. The parts should be a $19 teensy, a $5 battery, a $3 sensor, and a $1 led.

A video is here that shows the teensy and what the sensor looks like. It's currently measuring the amount of light hitting the sensor and displaying a relative value.

Challenges are:

  • making the sensor, LED, teensy, and battery (not shown) easy to wear.
  • the algorithm to detect a heartbeat from the change in light intensity

Published on 07 Dec 2012 at 06:33PM .


In looking for an interesting project for the digispark, I settled on a wearable heartrate monitor.

The microcontroller will be worn at the wrist. The photo here is a Teensy 3, but I hope to use the digispark when it arrives in the mail since it's much smaller.

An LED + light meter will be worn on the fingertip.

The data will either go onto an SD card or use a Bluetooth4 radio to send it to a cell phone.

Published on 30 Nov 2012 at 09:38PM .

