Blog moved

I’ve relocated my blog to the new URL http://www.pizzaandcode.com/

I hope I’ll see you there!

Which company will Oracle acquire next?

Stephen Jannise, ERP Market Analyst at ERP Software Advice (http://www.softwareadvice.com/), has written an article that looks back at the past five years of Oracle acquisitions in an attempt to define a few key strategies that have guided the company’s purchasing decisions.

Based on these strategies, Stephen suggests fourteen potential targets that, in one way or another, are compatible with Oracle’s recent plans or suggest bold, new directions for the company.

Stephen has encouraged readers to voice their own opinions by voting in a poll that asks, “Which company will Oracle acquire next?”

Be sure to visit the poll here (http://www.softwareadvice.com/articles/manufacturing/oracle-mergers-acquisitions-whos-next-1080310/) and let everyone know where you stand.

BarCamp Boston 5 on April 17th & 18th, 2010 at the MIT Stata Center

BarCamp is coming back to Boston on April 17th and 18th. Check out http://wiki.barcampboston.org/

I’ve offered to lead a session on

Database Scalability: Is the relational database really dead?

I notice that Omer Trajman wants to lead a session on Hadoop. If you want to know anything about Hadoop/MapReduce, attend his session. I can fairly say that Omer knows a thing or two (or three) about Hadoop :)

It is 2010 and RAID5 still works …

This blog has moved. Please visit this post at the new blog here.

Some years ago (2007, 2008), when I cared a little more about things like RAID and RAID recovery, I read an article on ZDNet by Robin Harris that made the case that disk capacity increases, coupled with an almost invariant URE (Unrecoverable Read Error) rate, meant that RAID5 was dead in 2009. A follow-on article appeared recently, also by Robin Harris, that extends the same logic and claims that RAID6 would stop working in 2019.
The crux of the argument is this. As disk drives have become larger and larger (approximately doubling every two years), the URE rate has not improved at the same pace. The URE rate measures how often an Unrecoverable Read Error occurs and is typically expressed in errors per bits read. For example, a URE rate of 1E-14 (10 ^ -14) implies that, statistically, an unrecoverable read error would occur once in every 1E14 bits read (1E14 bits = 1.25E13 bytes, or approximately 12.5 TB).
Further, Robin considers a RAID array (RAID5 or RAID6) that is running normally when a drive suffers a catastrophic failure that prompts a reconstruction from parity. In that scenario, it is perfectly conceivable that a single URE may occur while reading the (N-1) data drives and the parity stripe in order to rebuild the failed data drive. That URE would render the RAID volume failed.
The argument is that as disk capacities grow, and the URE rate does not improve at the same pace, the probability of a RAID5 rebuild failure increases over time. Statistically, he shows that by 2009 disk capacities would have grown enough to make RAID5 pointless for any array of meaningful size.
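To put rough numbers on that argument, here is a minimal sketch of the arithmetic. It assumes that every bit read fails independently at the quoted URE rate and that a rebuild reads every surviving drive end to end. The function names and the drive sizes used below are my own illustrations, not figures from Robin's articles.

```cpp
#include <cassert>
#include <cmath>

// Probability of hitting at least one URE while reading `bits_read` bits,
// assuming each bit read fails independently with probability `ure_rate`.
// P = 1 - (1 - ure_rate)^bits_read, evaluated with log1p/expm1 so that the
// tiny per-bit probability is not lost to floating-point rounding.
double probability_of_ure(double bits_read, double ure_rate) {
    assert(ure_rate >= 0.0 && ure_rate < 1.0);
    return -std::expm1(bits_read * std::log1p(-ure_rate));
}

// Bits that must be read to rebuild one failed drive, assuming every
// surviving drive in the set is read in full.
double rebuild_bits(int surviving_drives, double drive_bytes) {
    return surviving_drives * drive_bytes * 8.0;
}
```

For a hypothetical set with six surviving 2 TB drives, rebuild_bits(6, 2e12) is 9.6E13 bits, and probability_of_ure(9.6e13, 1e-14) comes out at roughly 0.62. Under these simplified assumptions, a rebuild after a whole-drive failure is more likely than not to run into a URE.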
So, in 2007 he wrote:

RAID 5 protects against a single disk failure. You can recover all your data if a single disk breaks. The problem: once a disk breaks, there is another increasingly common failure lurking. And in 2009 it is highly certain it will find you.

and in 2009, he wrote:

SATA RAID 6 will stop being reliable sooner unless drive vendors get their game on. More good news: one of them already has.

The logic proposed is accurate but, IMHO, incomplete. One important aspect that the analysis fails to point out is something that RAID vendors have already been doing for many years now.
Image courtesy of www.computer-history.info
When disk drives looked like this (picture at right), the predominant failure mode was the catastrophic failure. Drives either worked or didn’t work any longer. At some level, that was a reflection of the fact that the Drive Permanent Failure (DPF) frequency was significantly higher than the URE frequency, and therefore the only observed failure mode was catastrophic failure.
As drives got bigger, and certainly in 1988 when Patterson and others first proposed the notion of RAID, it made perfect sense to wait for a DPF and then begin drive reconstruction. The possibility of a URE was so low (given drive capacities) that all you had to worry about was the rebuild time, and the degraded performance during the rebuild (as I/O’s may have to be satisfied through block reconstruction).
But, that isn’t how most RAID controllers today deal with drive URE’s and drive failures. On the contrary, for some time now, RAID controllers (at least the recent ones I’ve read about) have used better methods to determine when to perform the rebuild.
A 5400 RPM SATA Drive
Consider this alternative, which I know to be used by at least a couple of array vendors. When a drive in a RAID volume reports a URE, the array controller increments a count and satisfies the I/O by rebuilding the block from parity. It then performs a rewrite on the disk that reported the URE (potentially with verify) and if the sector is bad, the microcode will remap and all will be well.
When the counter exceeds some threshold, and with the disk that reported the URE still in a usable condition, the RAID controller will begin the RAID5 recovery. Robin is correct that RAID recovery after DPF is something that will become less and less useful as drive capacities grow. But, with improvements in integration of SMART and the significant improvements in the predictability of drive failures, the frequency of RAID5 and RAID6 reconstruction failures is dramatically lower than predicted in the referenced articles, as these reconstructions occur on URE and not DPF.
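The policy described above can be sketched roughly as follows. This is only an illustration: the type and function names are hypothetical, the threshold is arbitrary, and real controller microcode will certainly differ. The one standard piece is the reconstruction itself, which relies on RAID5 parity being the XOR of the data blocks in a stripe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Block = std::vector<std::uint8_t>;

// XOR together all the blocks that are still readable in a stripe.
// Given the data blocks, this yields the parity block; given the parity
// block and all data blocks but one, it yields the missing data block.
Block reconstruct_block(const std::vector<Block>& readable_blocks) {
    Block result(readable_blocks.empty() ? 0 : readable_blocks.front().size(), 0);
    for (const Block& b : readable_blocks) {
        assert(b.size() == result.size());  // all blocks in a stripe are the same size
        for (std::size_t i = 0; i < result.size(); ++i)
            result[i] ^= b[i];
    }
    return result;
}

enum class Action { RepairSector, StartProactiveRebuild };

// Hypothetical per-drive bookkeeping for the URE counter.
class DriveErrorTracker {
public:
    explicit DriveErrorTracker(unsigned threshold) : threshold_(threshold) {}

    // Called each time the drive reports a URE. While the count is at or
    // below the threshold, the block is served from parity and the sector
    // is rewritten; once the count exceeds it, the rebuild starts while the
    // drive is still readable.
    Action on_ure() {
        ++ure_count_;
        return ure_count_ > threshold_ ? Action::StartProactiveRebuild
                                       : Action::RepairSector;
    }

private:
    unsigned threshold_;
    unsigned ure_count_ = 0;
};
```

The point of the design is that the rebuild begins while the suspect drive can still serve most of its sectors, so a URE on another drive during the rebuild does not have to be fatal.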
Look at the specifications for the RAID controller you use.

When is RAID recovery initiated? Upon the occurrence of an Unrecoverable Read Error (URE) or upon the occurrence of a Drive Permanent Failure (DPF)?

Several people have proposed that ZFS with multiple copies is the way to go. While it addresses the issue, I submit to you that it is at the wrong level of the stack. Mirroring at the block level, with the option to have multiple mirrors, is the correct (IMHO) solution. Disk block error recovery should not be handled in the file system.

The Application Marketplace: Android’s worst enemy?

I recently got an Android (Motorola A855, aka droid) phone. I had been using a Windows based device (have been since about 2003). I was concerned about the bad reviews of poor battery life and the fact that Bluetooth Voice Dialing was not present. I figured that the latter was a software thing and could be added later. So, with some doubt, I started using my phone.

On the first day, with a battery charged overnight, I proceeded to surf the Marketplace and download a few applications. I got a Google Voice Dialer (not the one from Google), and a couple of other “marketplace” applications. I used the maps with the GPS for a short while and in about 8 hours the yellow sign of “low battery” came on. I had Google (GMAIL) synchronization set to the default (sync enabled).

Pretty crappy, I thought. My Samsung went for two days without a problem. I had activesync with server (Exchange) or GMail refresh every 5 minutes for years!

The Google Voice dialer I downloaded had some bugs (it messed up the call log pretty badly) and I got bored of the other applications I had downloaded.

Time for a hard reset and restart for the phone (just to be sure I got rid of all the gremlins. After all, I was a Windows phone user, this was a weekly ritual).

I got the update to Google Maps, set synch to continuous, downloaded the “sky map” application and charged the phone up fully. That was on Wednesday afternoon (17th). Today is the 20th and the battery is still all green on the home page.

The robustness of downloaded Android Apps

One of the things that makes the Android phone so attractive (the application marketplace) is certainly a big problem. The robustness and stability of the downloaded applications cannot be guaranteed. We all realize that “your mileage may vary”. But, a quick look at the “Best Practices” on the Android SDK site indicates that a badly written application can keep the CPU too busy and burn through your battery.

Maybe the problem with Android phones (and the battery life in particular) is more an issue of poorly written applications.

Apple (with the Macintosh) had a tight grip on the applications that could be released on the Mac. This helped them ensure that buggy software didn’t give the Mac a bad name. I’m sure Windows users can relate to this.

They seem to have the same control on the iPhone App Store. Maybe that’s why I don’t hear so much about crappy applications on the iPhone that crash or suck the battery dry!

Should Google take some control over the crap on the marketplace or will it all straighten itself out over time?

ZL Technologies vs. Gartner (Part 2)

In an earlier post, I had commented on the lawsuit between ZL Technologies and Gartner.

Today, I received an email from Rob Elliott of ZL Technologies. The email does not say what position Mr. Elliott holds at ZL Technologies.

The link it refers to as “here” points to here. I have not had a chance to go and look at the amended complaint.

A Day of Straight Talk on Cloud Computing, Coming December 10

http://www.xconomy.com/boston/2009/11/16/a-day-of-straight-talk-on-cloud-computing-coming-december-10/

Save the date! I met with Wade last week and he mentioned this event.

Punishment must fit the crime

I regularly read Dr. Dobb’s Code Talk and noticed this article today. What caught my attention was not the article itself, but rather the first response to the article from Jack Woehr.

Reproduced below is a screen shot of the page that I read and Jack’s comments. Really, I ask you, is C# all that bad?

Microsoft patent 7617530, the flap about sudo

The blogosphere has been buzzing with indignation about a Microsoft patent application 7617530 that apparently was granted earlier this month. You can read the application here.

Yes, enough people have complained that this is like sudo and why did Microsoft get a patent for this. In fairness the patent does attempt to distinguish what is being claimed from sudo and provides copious references to sudo. What few have mentioned is that the thing that Microsoft patents is in fact the exact functionality that some systems like Ubuntu use to allow non-privileged users to perform privileged tasks.

In PC Magazine, Matthew Murray writes,

Because a graphical interface is not a part of sudo, it seems clear the patent refers to a Windows component and not a Linux one. The patent even references several different online sudo resources, further suggesting Microsoft isn’t trying to put anything over on anyone. The same section’s reference to “one, many, or all accounts having sufficient rights” suggests a list that sudo also doesn’t possess.

IMHO, they may be missing something here.

Let’s set that all aside. What I find interesting is this. The patent application states, and I reproduce three paragraphs of the patent application here and have highlighted three sentences (the first sentences in each paragraph).

Standard user accounts permit some tasks but prohibit others. They permit most applications to run on the computer but often prohibit installation of an application, alteration of the computer’s system settings, and execution of certain applications. Administrator accounts, on the other hand, generally permit most if not all tasks.

Not surprisingly, many users log on to their computers with administrator accounts so that they may, in most cases, do whatever they want. But there are significant risks involved in using administrator accounts. Malicious code may, in some cases, perform whatever tasks are permitted by the account currently in use, such as installing and deleting applications and files–potentially highly damaging tasks. This is because most malicious code performs its tasks while impersonating the current user of the computer–thus, if a user is logged on with an administrator account, the malicious code may perform dangerous tasks permitted by that account.

To reduce these risks, a user may instead log on with a standard user account. Logging on with a standard user account may reduce these risks because the standard user account may not have the right to permit malicious code to perform many dangerous tasks. If the standard user account does not have the right to perform a task, the operating system may prohibit the malicious code from performing that task. For this reason, using a standard user account may be safer than using an administrator account.

Absolutely! Most people don’t realize that they are logged in as users with Administrator rights and can inadvertently do damaging things.

My question is this: why is the default user created when you install Windows on a PC an administrator user? As you go through the install process, the thing asks you questions like “what is your name” and “how would you like to log in to your PC”. It uses this to set up the first user on the machine. Why is that user an administrator user?

If Microsoft were smart (and really wanted to be good about this), the installation process would create two users: a day-to-day user who is non-Administrator, and an Administrator user.

I’m a PC and if Windows 8 comes up with an installation process that creates two users, a non-administrator user and an administrator user, then it would have been my idea. But, I don’t intend to go green holding my breath for this to happen. Someone tell me if it does.

A Report from Boston’s First “Big Data Summit”

A short write-up about last night’s Big Data Summit appeared on xconomy today.

My thanks to our sponsors, Foley Hoag LLP and the wonderful team at the Emerging Enterprise Center, Infobright, Expressor Software, and Kalido.

ZL Technologies vs. Gartner

I bumped headlong into this article by Mike Masnick on Techdirt. After reading Mike’s summary, I was overcome by a morbid desire to know more. Here is what Mike had to say,

a recent lawsuit filed by ZL Technologies, because ZL doesn’t like how Gartner ranked it in Gartner’s famous “magic quadrant” analysis, is pretty silly, and hopefully will get thrown out quickly.

I followed the trail here. There’s a description of why ZL Technologies feels that it was adversely affected. Maybe I am over-simplifying this but the argument is that ZL Technologies was adversely affected by being placed in a less “attractive” neighborhood on the Magic Quadrant. Gartner claims that they are free to do so and are protected by their first amendment rights as they are stating their “opinion”. ZL Technologies counters with this response.

Some excerpts from the filing are below.

The large institutions that are the potential customers for ZL’s products rely heavily on outside advice when making their purchasing decisions. The market for providing that advice is dominated by Gartner, a behemoth with $1.3 billion in annual revenues that sells research reports and consulting services to institutional technology consumers, and exercises make-or-break power over the technology providers whose products are aimed at such purchasers.

Because of negative statements in Gartner’s reports and Gartner’s low ranking of ZL’s email archiving software, ZL is often not even invited to respond to requests for proposals, or RFPs, issued by potential customers.

Complaint 32. Placement in that quartile renders ZL a “Niche Player,” and identifies ZL’s performance as inferior in both the “Ability to Execute” and “Completeness of Vision” areas.

Gartner argues that it cannot be held liable on any theory for any of its statements because they are “opinions,” and that the First Amendment erects a per se barrier to liability based on any expressions of opinion. That argument is overly simplistic, legally erroneous and factually inapposite. Qualifying an assertion as “opinion” is not the constitutional equivalent of crossing one’s fingers, and the United States Supreme Court has squarely rejected the proposition that the First Amendment creates “a wholesale defamation exemption for anything that might be labeled ‘opinion.’”

In the first instance, Gartner’s opinion defense fails because its statements concerning ZL’s products were assertions of fact, not opinion. The Ninth Circuit has established a three-part test for determining whether a statement is an assertion of fact or opinion. The standard first examines whether the defendant used figurative or hyperbolic language that negates the impression that the defendant was asserting an objective fact; second, whether the general tenor of the entire work negates that impression; and third, whether the statement at issue is capable of being proved true or false. Unelko Corp. v. Rooney, 912 F.2d 1049, 1053 (9th Cir. 1990). Under that rubric, Gartner’s defamatory statements are clearly assertions of fact.

The total amount they are claiming is, get this, $1.3b in settlement. The best comment in this whole article is this one where Paul McNamara says:

That’s not a judgment request, it’s an exit strategy.

It is going to be interesting to see what happens in this case. I asked a friend (an attorney who was familiar with this case) and he is following the case closely as well. One way or another, this case will likely be carefully watched by technology companies.

Wow! Google Documents can now share folders.

Wow! This is wonderful. Just logged into Google Documents and looked at the “cookie jar” space on the top right.

Folder Sharing in Google Docs!

That’s cool! And you even get to tell Google where to put it!

And you can tell Google exactly where to put it!

Boston Big Data Summit Kickoff, October 22nd 2009

Since the announcement of the Boston Big Data Summit on the 2nd of October, we have had a fantastic response. The event sold out two days ago. We figured that we could remove the tables from the room and accommodate more people. And, we sold out again. The response has been fantastic!

If you have registered but you are not going to be able to attend, please contact me and we will make sure that someone on the waiting list is confirmed.

There has been some question about what “Big Data” is. Curt Monash, who will be delivering the keynote and moderating the discussion at the event next week, writes:

… where “Big Data” evidently is to be construed as anything from a few terabytes on up.  (Things are smaller in the Northeast than in California …)

When you catch a fish (whether it is the little fish on the left or the bigger fish on the right), the steps to prepare it for the table are surprisingly similar. You may have more work to do with the big fish and you may use different tools to do it with; but the steps are the same.

So, while size influences the situation, it isn’t only about the size!

In my opinion, whether data is “Big” or not is more of a threshold discussion. Data is “Big” if the tools and techniques being used to acquire, cleanse, pre-process, store, process and archive, are either unable to keep up, or are not cost effective.

Yes, everything is bigger in California, even the size of the mess they are in. Now, that is truly a “Big Problem”!

The 50,000 row spreadsheet, the half a terabyte of data in SQL Server, or the 1 trillion row table on a large ADBMS are all, in their own ways, “Big Data” problems.

The user with 50k rows in Excel may not want (or be able to afford) a solution with a “real database”, and may resort to splitting the spreadsheet into two sheets. The user with half a terabyte of SQL Server or MySQL data may adopt some home-grown partitioning or sharding technique instead of upgrading to a bigger platform, and the user with a trillion CDR’s may reduce the retention period; but they are all responding to the same basic challenge of “Big Data”.

We now have three panelists:

It promises to be a fun evening.

I have some thoughts on subjects for the next meeting; if you have ideas, please post a comment here.

Massachusetts Non-Compete Public Hearing

A quick update on the Public Hearings at the Joint Committee on Labor and Workforce Development held in Boston on October 7th, 2009.

Today I went to the State House in Boston and testified before the Joint Committee on Labor and Workforce Development on the subject of Non-Competes in the state. The hearings today were dominated by bills that had to do with “paid sick days”. Here is the day’s agenda.

If you were a mother and wanted to make the case for paid sick days to care for your child, what would be better than to bring your child with you when you are about to testify to the Committee on Labor and Workforce Development on a bill about paid sick days? To be fair, the child sat quietly and ate a peanut butter and jelly sandwich and at one point tried to help read out her mother’s prepared testimony.

Hearing the testimony from several people, and seeing how many children there were in the room, drove home the point that many people made. When their children were sick, they had to take them along to work because they could not risk losing their jobs. That’s just wrong; I had assumed that most people had paid sick leave. Unfortunately, I learned today that this is not the case.

Hearing on Paid Sick Days

Testifying on the bills to allow Paid Sick Days.

On MapReduce and Relational Databases – Part 1

This blog has moved. Please visit us at the new location.

This post can be found at the new address.

 

This is the first of a two-part blog post that presents a perspective on the recent trend to integrate MapReduce with Relational Databases, especially Analytic Database Management Systems (ADBMS).

The first part of this blog post provides an introduction to MapReduce, provides a short description of the history and why MapReduce was created, and describes the stated benefits of MapReduce.

The second part of this blog post provides a short description of why I believe that integration of MapReduce with relational databases is a significant mistake. It concludes by providing some alternatives that would provide much better solutions to the problems that MapReduce is supposed to solve.

On MapReduce and Relational Databases – Part 2

This blog has moved, please visit us at the new address.

This post is now available at the new location.

This is the second of a two-part blog post that presents a perspective on the recent trend to integrate MapReduce with Relational Databases, especially Analytic Database Management Systems (ADBMS).

The first part of this blog post provides an introduction to MapReduce, provides a short description of the history and why MapReduce was created, and describes the stated benefits of MapReduce.

The second part of this blog post provides a short description of why I believe that integration of MapReduce with relational databases is a significant mistake. It concludes by providing some alternatives that would provide much better solutions to the problems that MapReduce is supposed to solve.

Announcing the Boston Big Data Summit

The Boston “Big Data Summit” will be holding its first meeting on Thursday, October 22nd 2009 at 6pm at the Emerging Enterprise Center at Foley Hoag in Waltham, MA.

The Boston area is home to a large number of companies involved in the collection, storage, analysis, data integration, data quality, master data management, and archival of “Big Data”. If you are involved in any of these, then the meeting of the Boston “Big Data Summit” is something you should plan to attend. Save the date!

The first meeting of the group will feature a discussion of “Big Data” and the challenges of “Big Data” analysis in the cloud.

Over 120 people signed up as of October 14th 2009.

There is a waiting list. If you are registered and won’t be able to attend, please contact me so we can allow someone on the wait list to attend instead.

Seating is limited so go online and register for the event at http://bigdata102209.eventbrite.com.

The Boston “Big Data Summit” thanks the Emerging Enterprise Center at Foley Hoag LLP for their support and assistance in organizing this event.

Agenda Updated

The Boston “Big Data Summit” is being sponsored by Foley Hoag LLP, Infobright, Expressor Software, and Kalido.

For more information about the Boston “Big Data Summit” please contact the group administrator at boston.bigdata@gmail.com


The Boston Big Data Summit is organized by Bob Zurek and Amrith (me) in partnership with the Emerging Enterprise Center at Foley Hoag LLP.


Tell me about something you failed at, and what you learnt from it.

This blog has moved. An updated post and comments are now at the new address.

I have been involved in a variety of interviews both at work and as part of the selection process in the town where I live. Most people are prepared for questions about their background and qualifications. But, at a whole lot of recent interviews that I have participated in, candidates looked like deer in the headlights when asked the question (or a variation thereof),

“Tell me about something that you failed at and what you learned from it”

A few people turn that question around and try to give themselves a back-handed compliment. For example, one that I heard today was,

“I get very absorbed in things that I do and end up doing an excellent job at them”

Really, why is this a failure? Can’t you get a better one?

Folks, if you plan to go to an interview, please think about this in advance and have a good answer to this one. In my mind, not being able to answer this question with a “real failure” and some “real learnings” is a disqualifier.

One thing that I firmly believe is that failure is a necessary by-product of showing initiative in just the same way as bugs are a natural by-product of software development. And, if someone has not made mistakes, then they probably have not shown any initiative. And if they can’t recognize when they have made a mistake, that is scary too.

Finally, I have told people who have been in teams that I managed that it is perfectly fine to make a mistake; go for it. So long as it is legal, within company policy and in keeping with generally accepted norms of behavior, I would support them. So, please feel free to make a mistake and fail, but please, try to be creative and not make the same mistake again and again.

Oracle fined $10k for violating TPC’s fair use rules

In a letter dated September 25, 2009, the TPC fined Oracle $10k based on a complaint filed by IBM. You can read the letter here.

Recently, Oracle ran an advertisement in The Wall Street Journal and The Economist making unsubstantiated superior performance claims about an Oracle/Sun configuration relative to an official TPC-C result from IBM. The ad ran twice on the front page of The Wall Street Journal (August 27, 2009 and September 3, 2009) and once on the back cover of The Economist (September 5, 2009). The ad references a web page that contained similar information and remained active until September 16, 2009. A complaint was filed by IBM asserting that the advertisement violated the TPC’s fair use rules.

Oracle is required to do four things:

1. Oracle is required to pay a fine of $10,000.
2. Oracle is required to take all steps necessary to ensure that the ad will not be published again.
3. Oracle is required to remove the contents of the page www.oracle.com/sunoraclefaster.
4. Oracle is required to report back to the TPC on the steps taken for corrective action and the procedures implemented to ensure compliance in the future.

At the time of this writing, the link http://www.oracle.com/sunoraclefaster is no longer valid.

Can you copyright movie times?

MovieShowtimes.com, a site owned by West World Media, believes that they have!

In his article, Michael Masnick relates the experience of a reader, Jay Anderson, who found a loophole on a web page at MovieShowtimes.com and figured out how to get movie times for a given zip code. He (Jay Anderson) then contacted the company asking how he could become an affiliate and drive traffic their way and was rewarded with some legal mumbo jumbo.

First of all, I think the minion at the law firm was taking a course on “Nasty Letter Writing 101” and did a fine job. I’m no copyright expert but if I received an offer from someone to drive more traffic to my site, my first response would not be to get a lawyer involved.

Second, this whole episode could have well been featured in the book, Letters from a Nut, by Ted L. Nancy or the sequel More Letters from a Nut.

But, this reminds me of something a former co-worker told me about an incident where his daughter wrote a nice letter to a company and got her first taste of legal overzealousness. He can correct the facts and fill in the details but, if I recall correctly, the daughter in question had written letters to many companies asking the usual children’s questions about how pretzels, or candy, or a nice toy was made. In response, some nice person in a marketing department sent a gift hamper back with a polite explanation of the process, etc. But one day the little child wanted to know (if my memory serves me correctly) why M&M’s were called M&M’s. So, along went the nice letter to the address on the box. The response was a letter from the same guy who now works for MovieTimesForDummies.com explaining that M&M’s was a copyright of the so-and-so-company and any attempt to blah blah blah.

I think it is only a matter of time before MovieTimesForDummies.com releases exactly the same app that Jay Anderson wanted to, closes the loophole that he found and fires the developer who left it there in the first place.

Oh, wait, I just got a legal notice from Amazon saying that the link on this blog directing traffic to their site is a violation of something or the other …

Multithreaded File I/O (Reflections on Dr. Dobb’s article by Stefan Wörthmüller)

I ran across an interesting article on Multi-Threaded File I/O in Dr. Dobb’s today. You can read the article at http://www.ddj.com/hpc-high-performance-computing/220300055

I was particularly intrigued by the statements on variability,

I repeated the entire test suite three times. The values I present here are the average of the three runs. The standard deviation in most cases did not exceed 10-20%. All tests have been also run three times with reboots after every run, so that no file was accessed from cache.

Initially, I thought 10-20% was a bit much; this seemed like a relatively straightforward test and variability should be low. Then I looked at the source code for the test and I’m now even more puzzled about the variability.

Get a copy of the sources here. It is a single source file and in the only case of randomization, it uses rand() to get a location into the file.

The code to do the random seek is below:

   if(RandomCount)
   {
      // Seek new position for Random access
      if(i >= maxCount)
         break;
      long pos = (rand() * fileSize) / RAND_MAX - BlockSize;
      fseek(file, pos, SEEK_SET);
   }

While this is a multi-threaded program, I see no calls to srand() anywhere in the program. Just to be sure, I modified Stefan’s program as attached here. (My apologies, the file has an extension of .jpg because I can’t upload a .cpp or .zip onto this free wordpress blog. The file is a Windows ZIP file, just rename it).

///////////////////////////////////////////////////////////////////////////////
// mtRandom.cpp   Amrith Kumar 2009 (amrith (dot) kumar (at) gmail (dot) com)
// This program is adapted from the program FileReadThreads.cpp by Stefan Woerthmueller
// No rights reserved. Feel Free to do what ever you like with this code
// but don't blame me if the world comes to an end.

#include "Windows.h"
#include "stdio.h"
#include "conio.h"
#include
#include 

#include
#include 

///////////////////////////////////////////////////////////////////////////////
// Worker Thread Function
///////////////////////////////////////////////////////////////////////////////

DWORD WINAPI threadEntry(LPVOID lpThreadParameter)
{
    // The thread index arrives packed into the pointer-sized parameter.
    int index = (int)(INT_PTR)lpThreadParameter;
    FILE * fp;
    char filename[32];

    sprintf ( filename, "file-%d.txt", index );

    fprintf ( stderr, "Thread %d started\n", index );
    if ((fp = fopen ( filename, "w" )) == NULL)
    {
        fprintf ( stderr, "Error opening file %s\n", filename );
    }
    else
    {
        // No srand() anywhere: record the first ten values rand() hands this thread.
        for (int i = 0; i < 10; i ++)
        {
            fprintf ( fp, "%d\n", rand() );
        }

        fclose (fp);
    }

    fprintf ( stderr, "Thread %d done\n", index );

    return 0;
}

#define MAX_THREADS (5)

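int main(int argc, char* argv[])
{
    HANDLE h_workThread[MAX_THREADS];

    for(int i = 0; i < MAX_THREADS; i++)
    {
        h_workThread[i] = CreateThread(NULL, 0, threadEntry, (LPVOID)(INT_PTR) i, 0, NULL );
        Sleep(1000);   // stagger the thread start times
    }

    WaitForMultipleObjects(MAX_THREADS, h_workThread, TRUE, INFINITE);

    // Release the thread handles now that every thread has finished.
    for(int i = 0; i < MAX_THREADS; i++)
    {
        CloseHandle(h_workThread[i]);
    }

    printf ( "All done. Good bye\n" );
    return 0;
}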

So, I confirmed that Stefan will be getting the same sequence of values from rand() over and over again, across reboots.
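
The C standard says as much: if rand() is called before any call to srand(), it behaves as though srand(1) had been called first. Here is a minimal sketch of the same check without the Windows threading. The helper name firstValues is my own invention for illustration and is not part of Stefan's program.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical helper (not from Stefan's program): seed the C library
// generator, then collect the first `count` values that rand() returns.
std::vector<int> firstValues(unsigned int seed, int count)
{
    std::srand(seed);
    std::vector<int> values;
    for (int i = 0; i < count; i++)
    {
        values.push_back(std::rand());
    }
    return values;
}
```

Calling firstValues(1, 10) twice returns the same ten numbers, and a program that never calls srand() starts from that same seed of 1. As far as I know, the Microsoft C runtime keeps the seed per thread, so each newly created thread would start from the same point as well.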

Why then is he still seeing 10-20% variability? Beats me, something smells here … I would assume that from run to run, there should be very little variability.

Thoughts?

From the “way-back machine”

We’ve all heard the expression “way-back machine” and some of us know about tools like the Time Machine. But, did you know that there is in fact a “way-back machine” ?

From time to time, I have used this service and it is one of those corners of the web that is nice to know about. I was reminded of it this morning in a conversation and that led to a pleasant walk through history.

If you aren’t familiar with the “way-back machine”, take a look at http://www.archive.org/web/web.php

Some day you may wonder what a web page looked like a while ago and the “way-back machine” is your solution.

Here are some interesting ones that I looked at today. The Time Magazine in February 1999.

Time magazine web page in February 1999

Ever wondered what the Dataupia web page looked like in February 2006? I know someone who would get a kick out of it so I went and looked it up.

The Dataupia web page from February 2006

Check it out sometime; the way-back machine is a wonderful afternoon diversion.

The “way back” archive is not complete, alas!

Florida recounts

Diluting education standards in Kansas (part II)

Coming in the aftermath of the efforts to outlaw the teaching of evolution in the state, this story about Kansas is unfortunate.

http://blog.acm.org/archives/csta/2009/09/post_4.html

http://usacm.acm.org/usacm/weblog/index.php?p=741

The state has significant employment problems and the recent downturn in the economy has hit its aircraft industry hard. With a nascent IT start-up scene there, this is probably the worst publicity that the state could have hoped for.

Who are you, really? The value of incorrect response in challenge-response style authentication.

Service providers (electricity, cable, wireless phone, POTS telephone, newspaper, banks, credit card companies) are regularly faced with the challenge of identifying and validating the identity of the individual who has called customer service. They have come up with elaborate schemes involving the last four digits of your social security number, your mailing address, your mother’s maiden name, your date of birth and so on. The risks associated with all of these have been discussed at great length elsewhere; social security numbers are guessable (see “Predicting Social Security Numbers from Public Data”, Acquisti and Gross), mailing addresses can be stolen, mother’s maiden names can be obtained (and in some Latin American countries your mother’s maiden name is part of your name) and people hand out their dates of birth on social networking sites without a problem!

Bogus Parking ticket

So, apart from identity theft by someone guessing at your identity, we also have identity theft because people give out critical information about themselves. Phishing attacks are well documented, and we have heard of the viruses that have spread based on fake parking tickets.

Privacy and Information Security experts caution you against giving out key information to strangers; very sound advice. But, how do you know who you are talking to?

Consider these two examples of things that have happened to me.

1. I receive a telephone call from a person who identifies himself as being an investment advisor from a financial services company where I have an account. He informs me that I am eligible for a certain service that I am not utilizing and he would like to offer me that service. I am interested in this service and I ask him to tell me more. In order to tell me more, he asks me to verify my identity. He wants the usual four things and I ask him to verify in some way that he is in fact who he claims to be. With righteous indignation he informs me that he cannot reveal any account information until I can prove that I am who I claim to be. Of course, that sets me off and I tell him that I would happily identify myself to be who he thinks I am, if he can identify that he is in fact who he claims to be. Needless to say, he did not sell me the service that he wanted to.

2. I call a service provider because I want to make some change to my account. They have “upgraded their systems” and having looked up my account number and having “matched my phone number to the account”, they put me through to a real live person. After discussing how we will make the change that I want, the person then asks me to provide my address. Ok, now I wonder why that would be? Don’t they have my address? Surely they do; they’ve managed to send me a bill every month.

“For your protection, we need to validate four pieces of information about you before we can proceed”, I am told.

The four items are my address, my date of birth, the last four digits of my social security number and the “name on the account”.

Of course, I ask the nice person to validate something (for example, tell me how much my last bill was) before I proceed. I am told that for my own protection, they cannot do that.

Computer scientists have developed several techniques that provide “challenge-response” style authentication where both parties can convince themselves that the other is who they claim to be. For example, public-key/private-key encryption provides a simple way in which to do this. Either party can generate a random string and provide it to the other, asking the other to encrypt it using the key that they have. The encrypted response is returned to the sender and that is sufficient to guarantee that the peer does in fact possess the appropriate “token”.

In the context of a service provider and a customer, there would be a mechanism for the service provider to verify that the “alleged customer” is in fact the customer who he or she claims to be but the customer also verifies that the provider is in fact the real thing.
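
To make the exchange concrete, here is a minimal sketch of one direction of such a challenge-response, using textbook RSA with a deliberately tiny key. Everything in it is an assumption for illustration: the key values (n = 61 × 53 = 3233, e = 17, d = 2753) are a classroom example, the function names are my own, and a real system would use a vetted cryptographic library with keys thousands of bits long.

```cpp
#include <cstdint>

// Toy RSA key pair, for illustration only. n = 61 * 53.
const std::uint64_t kModulus = 3233;
const std::uint64_t kPublicExponent = 17;     // known to everyone
const std::uint64_t kPrivateExponent = 2753;  // known only to the key owner

// Square-and-multiply modular exponentiation: (base ^ exponent) mod modulus.
std::uint64_t modPow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus)
{
    std::uint64_t result = 1;
    base %= modulus;
    while (exponent > 0)
    {
        if (exponent & 1)
        {
            result = (result * base) % modulus;
        }
        base = (base * base) % modulus;
        exponent >>= 1;
    }
    return result;
}

// The party being challenged transforms the challenge with its private key.
std::uint64_t respondToChallenge(std::uint64_t challenge)
{
    return modPow(challenge, kPrivateExponent, kModulus);
}

// The challenger undoes the transformation with the public key and checks
// that it gets its own challenge back.
bool verifyResponse(std::uint64_t challenge, std::uint64_t response)
{
    return modPow(response, kPublicExponent, kModulus) == challenge % kModulus;
}
```

The challenger would pick a fresh random number below the modulus for every call, send it over, and accept the peer only if verifyResponse returns true. For mutual authentication, the customer and the provider would each hold their own key pair and each play both roles.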

The risks in the first scenario are absolutely obvious; I recently received a text message (vector) that read

“MsgID4_X6V…@v.w RTN FCU Alert: Your CARD has been DEACTIVATED. Please contact us at 978-596-0795 to REACTIVATE your CARD. CB: 978-596-0795”

A quick web search does in fact show that this is a phishing event. Whether anyone has tracked that phone number down to find out if its owner is a poor unsuspecting victim or a perpetrator, I am not sure.

But, what does one do when in fact they receive an email or a phone call from a vendor with whom they have a relationship?

One could consult a psychic to find out whether it is authentic; check the New England SEERs, for example.

http://twitter.com/ILNorg/status/3786484194

http://twitter.com/NewEnglandSEERs

RT @Lucy_Diamond 978-596-0795 do not return call on text. Call police or your real bank. Caution bank fraud. Never give your pin to anyone

RT @Lucy_Diamond Warning bank scam via cell phone text remember never give your pin number to anyone. Your bank won’t ask you they know it

Agent: For your security please verify some information about your account. What is your account number

Me: Provide my account number

Agent: Thank you, could you give me your passphrase?

Me: ketchup

Agent: Thank you. Could you give me your mother’s maiden name

Me: Hoover Decker

Agent: Thank you. and the last four digits of your SSN

Me: 2004

Agent: Just one more thing, your date of birth please

Me: February 14th 1942

Agent: Thank you

Agent: For your security please verify some information about your account. What is your account number

Me: Provide my account number

Agent: Thank you, could you give me your passphrase?

Me: ketchup

Agent: That’s not what I have on the account

Me: Really, let me look for a second. What about campbell?

Agent: No, that’s not it either. It looks like you chose something else, but similar.

Me: Oh, of course, Heinz58. Sorry about that

Agent: That’s right, how about your mother’s maiden name.

Me: Hoover Decker

Agent: No, that’s not it.

Me: Sorry, Hoover Bissel

Agent: That’s right. And the last four of your social please

Me: 2007

Agent: thank you, and the date of birth

Me: Feb 29, 1946

Agent: Thank you

Agent: For your security please verify some information about your account. What is your account number

Me: Provide my account number

Agent: Thank you, could you give me your passphrase?

Me: ketchup

Agent: Thank you. Could you give me your mother’s maiden name

Me: Hoover Decker

Agent: Thank you. and the last four digits of your SSN

Me: 2004

Agent: Just one more thing, your date of birth please

Me: February 14th 1942

Agent: Thank you. Could you verify the address to which you would like us to ship the package.

(At this point, I’m very puzzled and not really sure what is going on)

Me: Provided my real address (say 10 Any Drive, Somecity, 34567)

Agent: I’m sorry, I don’t see that address on the account, I have a different address.

Me: What address do you have?

Agent: I have 14 Someother Drive, Anothercity, 36789.

The address the agent provided was in fact a previous location where I had lived.

What has happened is that the cable company (like many other companies these days) has outsourced the fulfillment of the orders related to this service. In reality, all they want is to verify that the account number and the address match! How they had an old address, I cannot imagine. But, if the address had matched, they would have mailed a little package out to me (it was at no charge anyway) and no one would be any the wiser.

But, I hung up and called the cable company on the phone number on my bill and got the full fourth-degree. And they wanted to talk to “the account owner”. But, I had forgotten what I told them my SSN was … Ironically, they went right along to the next question and later told me what the last four digits of my SSN were :)

Someone said they were interested in the security and privacy of my personal information?

We people born on the 29th of February 1946 are very skeptical.

http://twitter.com/NewEnglandSEERs

Faster or Free

I don’t know how Bruce Scott’s article showed up in my mailbox but I’m confused by it (happens a lot these days).

I agree with him that too much has been made about whether a system is a columnar system or a truly columnar system or a vertically partitioned row store and what really matters to a customer is TCO and price-performance in their own environment. Bruce says as much in his blog post

Let’s start talking about what customers really care about: price-performance and low cost of ownership. Customers want to do more with less. They want less initial cost and less ongoing cost.

Then, he goes on to say

On this last point, we have found that we always outperform our competitors in customer created benchmarks, especially when there are previously unforeseen queries. Due to customer confidentiality this can appear to be a hollow claim that we cannot always publicly back up with customer testimonials. Because of this, we’ve decided to put our money where our mouth is in our “Faster or Free” offer. Check out our website for details here: http://www.paraccel.com/faster_or_free.php

So, I went and looked at that link. There, it says:

Our promise: The ParAccel Analytic Database™ will be faster than your current database management system or any that you are currently evaluating, or our software license is free (Maintenance is not included. Requires an executed evaluation agreement.)

To be consistent, should that not make the promise that the ParAccel offering would provide better price-performance and lower TCO than the current system or the one being evaluated? After all, that is what customers really care about.

I’m confused. More coffee!

Oh, there’s more! Check out this link http://www.paraccel.com/cash_for_clunkers.php

Talk about fine print:

* Trade-in value is equivalent to the first year free of a three year subscription contract based on an annual subscription rate of $15K/user terabyte of data. Servers are purchased separately.

Desktop Email Client vs. GMail: Why desktop mail clients are still better than the GMail interface

Joe Kissell writes in CIO magazine about the six reasons why desktop email clients still rule. He opines that he would take a desktop email client any day and provides the following reason, and six more:

Well, there is the issue of outages like the one Gmail experienced this week. I like to be able to access my e-mail whenever I want. But beyond that, webmail still lags far behind desktop clients in several key areas.

Much has been written by many on this subject. As long ago as 2005, Cedric pronounced his verdict. Brad Shorr had a more measured comparison that I read before I made the switch about a month ago. Lifehacker pronounced the definitive comparison (IMHO it fell flat, their verdicts were shallow). Rakesh Agarwal presented some good insights and suggestions.

I read all of this and tried to decide what to do about a month ago. Here is a description of my usage.

My Usage

1. Email Accounts

I have about a dozen. Some are through GMail, some are on domains that I own, one is at Yahoo, one at Hotmail and then there is a smattering of others like aol.com, ZoHo and mail.com. While employed (currently a free agent) I have always had an Exchange Server to look at as well.

2. Email volume

Excluding work related email, I receive about 20 or 30 messages a day (after eliminating SPAM).

3. Contacts

I have about 1200 contacts in my address book.

4. Mobile device

I have a Windows Mobile based phone and I use it for calendaring, email and as a telephone. I like to keep my complete contact list on my phone.

5. Access to Email

I am NOT a Power-User who keeps reading email all the time (there are some who will challenge this). If I can read my email on my phone, I’m usually happy. But, I prefer a big screen view when possible.

6. I like to use instant messengers. Since I have accounts on AOL IM, Y!, HotMail and Google, I use a single application that supports all the flavors of IM.

Seems simple enough, right? Think again. Here is why, after migrating entirely to GMail, I have switched back to a desktop client.

The Problem

1. Google Calendar and Contact Synchronization is a load of crap.

Google does some things well. GMail (the mail and parts of the interface) is one of these things. They support POP and IMAP, they support consolidation of accounts through POP or IMAP, they allow email to be sent as if from another account. They are far ahead of the rest. With Google Labs you can get a pretty slick interface. But, Calendar and Contact Synchronization really suck.

For example, I start off with 1200 contacts and synchronize my mobile device with Google. How do I do it? By creating an Exchange Server called m.google.com and having it provide Calendar and Contacts. You can read about that here. After synchronizing the two, I had 1240 or so contacts on my phone. Ok, I had sent email to 40 people through GMail who were not in my address book. Great!

Then I changed one person’s email address and the wheels came off the train. It tried to synchronize everything and ended up with some errors.

I started off with about 120 entries in my calendar; after synchronizing every hour, I now have 270 or so. Well, each time it felt that contacts had been changed, it refreshed them and I now have seventeen appointments tomorrow saying it is someone’s birthday. Really, do I get extra cake or something?

2. Google Chat and Contact Synchronization don’t work well together.

After synchronizing contacts my Google Chat went to hell in a hand-basket. There’s no way to tell why, I just don’t see anyone in my Google Chat window any more.

Google does some things well. The GMail server side is one of them. As Bing points out, Google Search returns tons of crap (not that Bing does much better). Calendar, Contacts and Chat are still not in the “does well” category.

So, it is back to Outlook Calendar and Contacts and POP Email. I will get all the email to land in my GMail account though, nice backup and all that. But GMail Web interface, bye-bye. Outlook 2007 here I come, again.

The best of both worlds

The stable interface between a phone and Outlook, a stable calendar, contacts and email interface (hangs from time to time but the button still works), and a nice online backup at Google. And, if I’m at a PC other than my own, the web interface works in a pinch.

POP all mail from a variety of accounts into one GMail account and access just that one account from both the telephone and the desktop client. And install that IM client application again.

What do I lose? The threaded message format that GMail has (that I don’t like). Yippie!

ParAccel TPC-H 30TB results challenged!

Before I had my morning cup of coffee, I found an email message with the subject “ParAccel ADVISORY” sitting in my mail box. Now, I’m not exactly sure why I got this message from Renee Deger of GlobalFluency so my first suspicion was that this was a scam and that someone in Nigeria would be giving me a million dollars if I did something.

But, I was disappointed. Renee Deger is not a Nigerian bank tycoon who will make me rich. In fact, ParAccel’s own blog indicates that their 30TB results have been challenged.

We wanted you to hear it from us first.  Our TPC-H Benchmark for performance and price-performance at the 30-terabyte scale was taken down following a challenge by one of our competitors and a review by the TPC.  We executed this benchmark in collaboration with Sun and under the watch of a highly qualified and experienced auditor.   Although it has been around for many years, the TPC-H specification is still subject to interpretation, and our interpretation of some of the points raised was not accepted by the TPC review board.

None of these items impacts our actual performance, which is still the best in the industry.  We will rerun the benchmark at our earliest opportunity according to the interpretations set forth by the TPC review board. We remain committed to the organization, as outlined in a blog post by Barry Zane, our CTO, here: http://paraccel.com/data_warehouse_blog/?p=74#more-74.

Please see the company blog for a post by David Steinhoff in the office of the CTO for further info: http://paraccel.com/data_warehouse_blog/?p=104#more-104

I read David Steinhoff’s blog as well. He writes

This last week, our June 2009 30TB results were challenged and found to be in violation of certain TPC rules. Accordingly, our results will soon be taken down from the TPC website.

We published our results in good faith. We used our standard, customer available database to run the benchmark (we wanted the benchmark to reflect the incredible performance our customers receive). However, the body of TPC rules is complex and undergoes constant interpretation; we are still relatively new to the benchmark game and are still learning, and we made some mistakes.

While we cannot get into the details of the challenges to our results (TPC proceedings are confidential and we would be in violation if we did), we can state with confidence that our query performance was in no way enhanced by the items that were challenged.

We can also say with confidence that we will publish again, soon.

Now, some competitor or competitor loyalist may try to make more of this than there is … we all know there is the risk of tabloid frenzy around any adversity in a society with free speech … and we wouldn’t have it any other way.

It is unfortunate that the proceedings are confidential and cannot be shared. I hope you republish your results at 30TB.

Contrary to a long list of pundits, I believe that these benchmarks have an important place in the marketing of a product and its capabilities.

I reiterate what I said in a previous blog post

ParAccel’s solution is based on high-performance trilithium crystals. (Note: I don’t know why this wasn’t disclosed in the full disclosure report).

I hope the challenge was not about the trilithium crystals and the fact that you didn’t disclose it in the full disclosure report.

A meaningful networking platform for children with special needs.

My friend Jon Erickson and his wife Kristin have been working on Parlerai for some time now. As parents of a child with special needs, Kristin and Jon were frustrated by a lack of available tools for parents like them to track and share information and ultimately collaborate with the people making a difference in their daughter’s life. They have built the Parlerai platform which will profoundly and positively impact the lives of the individuals in their family, and the lives of parents around the world.

Parlerai creates a secure network of family, friends and caregivers surrounding a child with special needs and uses innovative and highly personalized tools to enhance collaboration and provide a highly secure method of communicating via the Internet. There are tools for parents, tools for children and tools for caregivers. As the world’s first Augmentative Collaboration service, Parlerai (French for “shall speak”) gives children with special needs a voice.

Parlerai is for children with special needs. They use it to communicate and access online media and information in a safe environment.

Parlerai is for parents of children with special needs. They use it to share and communicate information about their children. It gives them a single medium through which they can collaborate with others who are part of their children’s world.

Parlerai is for consultants, educators and caregivers of children with special needs. With Parlerai, they are able to collaborate with others who are part of the children’s world.

In an interview with Dana Puglisi, Kristin describes the value of Parlerai

“For consultants, educators, and other caregivers, imagine trying to make a difference in a child’s life but only seeing that child for half an hour each week. How do you really get to know that child? How can you make the most impact? Now imagine having access to information provided by others who work with that child – her ski instructor, her physical therapist, her grandmother, her babysitter. Current information, consistent information. Imagine how much more you could learn about that child. Imagine how much greater your impact could be.”

Here is what you can do to help

I read some statistics about children with special needs some months ago. Based on those numbers, each and every one of us knows of at least three people whose children have special needs. Parlerai is a system that could profoundly change the lives of these children and their parents and caregivers. Do your part, and spread the word about Parlerai. You can make the difference.

Spread the word, visit http://tinyurl.com/parlerai (a link to this blog post) or http://www.parlerai.com.

More information is also available at Jon’s blog at http://jon-erickson.blogspot.com/ and Kristin’s blog at http://kristingerickson.blogspot.com/

Facebook’s OpenID integration is not very useful!

Preamble:

I don’t know a damn thing about OpenID and less about web applications, but I do know a thing about security, authentication and the like. And, I am a facebook user and like most other internet consumers in this day and age, I am not thrilled that I have to remember a whole bunch of different user names and passwords for each and every online location that I visit.

Facebook’s OpenID integration

Once, and for one last time, you login to facebook with your existing credentials. Let’s say that is your username <joe@joeblow.com> and then you go over to Settings and create your OpenID as a Linked Account. In the interests of full disclosure, I am still working with Gary Krall of Verisign who posted a comment on my previous post describing problems with this linking process. I am sure that we will get that squared away and I can get the linking to work.

Once this linkage is created, a cookie is deposited on your machine indicating that authentication is by OpenID. You wake up in the morning, power up your PC and launch your browser and login to your OpenID provider, and in a second tab, you wander over to www.facebook.com.

The way it is supposed to work is this: something looks at the OpenID cookie deposited earlier and uses that to perform your validation.

Are you nuts?

As I said earlier, I don’t know a lot about building Web Applications. But, methinks the sensible way to do this is a little different from the way facebook is doing things.

Look, for example, at news.ycombinator.com. On the login screen, below the boxes for username and password is a button for other authentication mechanisms. If you click that, you can enter your OpenID URL and voila, you are on your way. No permanent cookies involved.

Now, if you didn’t have your morning Joe, and you went directly to news.ycombinator.com and tried to enter your OpenID name, you are promptly forwarded to your OpenID provider’s page to ask for authentication. Over, end of story. No permanent cookies involved.

Ok, just to verify, I did this …

I went to a friend’s PC that I had never used before, pointed his browser (firefox) to news.ycombinator.com, clicked the button under login/password, entered my OpenID name and sure enough it vectored over to Verisign Labs. I logged in and voila, I’m on Hacker News.

Am I missing something? It sounds to me like facebook is paying lip service to OpenID. Either that or they just don’t get it?

A farewell to facts, rigor and civility

I read this article from Mike (The Fein Line) Feinstein’s blog and it occurs to me that we have collectively chosen to ignore facts, rigor and civility.

For a whole bunch of reasons (beyond the one that Mike mentions), I have to say that I too am happy and proud that I live in Massachusetts.

OpenID first impressions

I have been meaning to try OpenID for some time now and I just noticed that they were doing a free TFA (what they call VIP Credentials) thing for mobile devices so I decided to give it a shot.

I picked Verisign’s OpenID offering; in the past I had a certificate (document signing) from Verisign and I liked the whole process so I guess that tipped the scales in Verisign’s favor.

The registration was a piece of cake, downloading the credential generator to my phone and linking it to my account was a breeze. They offer a File Vault (2GB) free with every account (Hey Google, did you hear that?) and I gave that a shot.

I created a second OpenID and linked it to the same mobile credential generator (very cool). Then I figured out what to do if my cell phone (and mobile credential generator) were to be lost or misplaced; it was all very easy. Seemed too good to be true!

And, it was.

Facebook allows one to use an external ID for authentication. Go to Account Settings and Linked Accounts and you can setup the linkage. Cool, let’s give that a shot!

Facebook OpenID failure

So much for that. I have an OpenID, anyone have a site I could use it on?

Oh yes! I could login to Verisignlabs with my OpenID :)

Update:

I tried to link my existing “Hacker News” (news.ycombinator.com) account with OpenID and after authenticating with verisign, I got to a page that asked me to enter my HN information which I did.

I ended up with a page: http://news.ycombinator.com/openid_merge and a single word “Unknown” on the screen.

I’ve got to be doing something wrong. Someone care to tell me how badly messed up I am?

Update (sept 11)

Thanks to help from Gary (who commented on this post), I tried the “linking” on Facebook again and this time it worked a little better.

But, I still have to enter my password when I want to login to facebook. Something is still not working the way it should.

Still the same issue with Hacker News.

What I learnt from the GMAIL outage

We all have heard about it, many of us (most of us) were affected by it, some of us actually saw it. This makes it a fertile subject for conversation; in person and over a cold pint, or online. I have read at least a dozen blog posts that explain why the GMAIL outage underscores the weakness of, and the reason for imminent failure of cloud computing. I have read at least two that explain why this outage proves the point that enterprises must have their own mail servers. There are graphs showing the number of tweets at various phases of the outage. There are articles about whether GMAIL users can sue Google over this failure.

The best three quotes I have read in the aftermath of the Gmail outage are these:

“So by the end of next May, we should start seeing the first of the Google Outage babies being born.” – Carla Levy, Systems Analyst

“Now I don’t look so silly for never signing up for an e-mail address, do I?” – Eric Newman, Pile-Driver Operator

“Remember the time when 150 million people couldn’t use Gmail for nearly ten years? From 1993–2003? And every year before that? Unimaginable.” – Adam Carmody, Safe Installer

Admittedly, all three came from “The Onion”.

This article is about none of those things. To me, the GMAIL outage could not have come at a better time. I have just finished reconfiguring where my mail goes and how it gets there. The outage gave me a chance to make sure that all the links worked well.

I have a GMAIL account and I have email that comes to other (non-GMAIL) addresses. I use GMAIL as a catcher for the non-GMAIL addresses using the “Imports and Forwarding” capability of GMAIL. That gives me a single web based portal to all of my email. The email is also POP3’ed down to a PC, the one which I am using to write this blog post. I get to read email on my phone (using its POP3 capability) from my GMAIL account. Google is a great backup facility, a nice web interface, and a single place where I can get all of my email. And, if for any reason it were to go kaput, as it did on the 1st, in a pinch, I can get to the stuff in a second or even a third place.

But, more importantly, if GMAIL is unavailable for 100 minutes, who gives a crap? Technology will fail. We try to make it better but it will still fail from time to time. Making a big hoopla about it is just plain dumb. On the other hand, an individual could lose access to his or her GMAIL for a whole bunch of reasons; not just because Google had an outage. Learn to live with it.

So what did I learn from the GMAIL outage? It gave me a good chance to see a bunch of addicts, and how they behave irrationally when they can’t get their “fix”. I’m a borderline addict myself (I do read email on my phone, as though I get things of such profound importance that instant reaction is a matter of life and death). The GMAIL outage showed me what I would become if I did not take some corrective action.

Technology has given us the means to “shrink the planet” and make a tightly interconnected world. With a few keystrokes, I can converse with a person next door, in the next state or half way across the world. Connectivity is making us accessible everywhere; in our homes, workplaces, cars, and now, even in an aircraft. It has given us the ability to inundate ourselves with information, and many of us have been over-indulging (to the point where it has become unhealthy).

Full Corn Moon

Today was the “Full Corn Moon” and I was lucky enough to get a clear night.

Full Corn Moon, September 4, 2009

Full Corn Moon, September 4, 2009

You can see a larger image by clicking on the picture above.

The moon is barely visible in the first image, hidden by the trees. In the second and the third, it makes it out of the trees just as the sun is setting behind me.

And here I was complaining about Verizon Wireless!

I thought poorly of Verizon Wireless service and features (though I’ve been a customer for a while).

That all changed when I read this.

Hey Verizon, where do I sign up for that two year contract? But, could you give me a cellular data plan with more than 5GB per month please …

Moonrise!

I have never had too much luck with photographs of the moon. That may just have changed :)

Moonrise

Moonrise

Moonrise

Moonrise

Till January 1, 2010, Bye-Bye Linux!

Around the New Year each year, the fact that I am bored silly leads me to do strange things. For the past couple of years, in addition to drinking a lot of Samuel Adams Double Bock or Black Lager, I kick Windows XP, Vista or whatever Redmond has to offer and install Linux on my laptop.

For two years now, Ubuntu has been the Linux of choice. New Year 2009 saw me installing 8.10 (Ignorant Ignoramus) and later upgrading to 9.04 (Jibbering Jackass). But, I write this blog post on my Windows XP (Service Pack 3) powered machine.

Why the change, you ask?

This has arguably been the longest stint with Linux. In the past (2007) it didn’t stay on the PC long enough to make it into work after the New Year holiday. In 2008, it lasted two or three weeks. In 2009, it lasted till the middle of August! Clearly, Linux (and Ubuntu has been a great part of this) has come a very long way towards being a mainstream replacement for Windows.

But, my benchmark for ease of use still remains:

  1. Ease of initial installation
    • On Windows, stick a CD in the drive and wait 2 hours
    • On Linux, stick a CD in the drive and wait 20 minutes
    • Click mouse and enter some basic data along the way
  2. Ease of setup, initial software update, adding basic software that is not part of the default distribution
    • On Windows, VMWare (to run linux), Anti-Virus, Adobe things (Acrobat, Flash, …)
    • On Linux, VMWare (to run windows), Adobe things
  3. Ease of installing and configuring required additional “stuff”, additional drivers
    • printers
    • wacom bamboo tablet
    • synchronization with PDA (Windows ActiveSync, Linux <sigh>)
    • On Windows, DELL drivers for chipset, display, sound card, pointer, …
  4. Configuring Display
    • resolution, alignment
  5. Configuring Mouse and Buttons
  6. Making sure that docking station works
    • On Windows, DELL has some software to help with this
    • On Linux, pull your hair out
  7. Setting Power properties for maximum battery life
    • On Windows, what a pain
    • On Linux, CPU Performance Applet
  8. Making sure that I login and can work as a non-dangerous user
    • On Windows, group = Users
    • On Linux, one who can not administer the system, no root privileges
  9. Setup VPN
    • On Windows, CISCO VPN Client most often. Install it and watch your PC demonstrate all the blue pixels on the screen
    • On Linux, go through the gyrations of downloading Cisco VPN client from three places, reading 14 blogs, web pages and how-to’s on getting the right patches, finding the compilers aren’t on your system, finding that ‘patch’ and system headers are not there either. Finally, realizing that you forgot to save the .pcf file before you blew Windows away so calling IT manager on New Year’s day and wishing him Happy New Year, and oh, by the way, could you send me the .pcf file (Thanks Ed).
  10. Setup Email and other Office Applications
    • On Linux, installing a Windows VM with all of the Office suite and Outlook
    • On Windows, installing all of the Office suite and Outlook and getting all the service packs
    • Install subversion (got to have everything under version control). There’s even a cool command line subversion client for Windows (Slik Subversion 1.6.4)
  11. Migrate Mozilla profile to new platform
    • Did you know that you can literally take .mozilla and copy it to someplace in %userprofile% or vice-versa and things just work? Way cool! Try that with Internet Exploder!
  12. Restore SVN dump from old platform

OK, so I liked Linux for the past 8 months. GIMP is wonderful, the Bamboo tablet (almost) just works, the system booted really fast, … I can go on and on.

But, some things really annoyed me with Linux over the past 8 months:

  • Printing to the Xerox multi function 7335 printer and being able to do color, double sided, stapling etc. The setup is not for the faint hearted
  • Could I please get the docking station to work?
  • Could you please make the new Mozilla part of the updates? If not, I have Firefox and Shrill-kokatoo or whatever the new thing is called. What a load of horse-manure that upgrade turned out to be. On Windows, it was a breeze. Really, open-source-brethren, could you chat amongst yourselves?

But the final straw was that I was visiting a friend in Boston and wanted to whip out a presentation and show him what I’d been up to. External display is not an easy thing to do. First you have to change resolutions, then restart X, then crawl through a minefield, sing the national anthem backwards while holding your nose. Throughout this “setup”, you have to be explaining that it is a Linux thing.

Sorry folks, you aren’t ready for mainstream laptop use, yet. But, you’ve made wonderful improvement since 2007. I can’t wait till December 31, 2009 to try this all over again with Ubuntu 9.10 (Kickass Knickerbockers).

OpenDNS again

I wasn’t about to try diddling the router at 11:30 last night but it seemed like a no-brainer to test out this OpenDNS service.

So, look at the little button below. If it says “You’re using OpenDNS” then clearly someone in your network (your local PC, router, DNS, ISP, …) is using OpenDNS. The “code” for this button is simplicity itself:

<a title="Use OpenDNS to make your Internet faster, safer, and smarter." href="http://www.opendns.com/share/">
     <img style="border:0;"
          src="http://images.opendns.com/buttons/use_opendns_155x52.gif"
          alt="Use OpenDNS" width="155" height="52" />
</a>

So, if the lookup for images.opendns.com was sent to my ISP’s DNS server, the name would likely resolve one way, and if it was sent to OpenDNS, it would resolve a different way. That means that the image retrieved would differ based on whether you are using OpenDNS or not.

Use OpenDNS
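To see for yourself whether the same name resolves differently at two resolvers, a minimal Python sketch along these lines may help. It hand-builds a DNS query for an A record, sends it to a specific server over UDP, and pulls the IPv4 addresses out of the reply. The function names are made up for illustration, and the resolver address in the usage note (208.67.222.222, one of the public OpenDNS resolvers) is not taken from the post. There are no retries and no TCP fallback, so treat it as a sketch rather than a proper resolver.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(packet, offset):
    """Return the offset just past an encoded (possibly compressed) name."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # compression pointer, always two bytes
            return offset + 2
        offset += 1 + length


def parse_a_records(packet):
    """Extract the IPv4 addresses from the answer section of a DNS response."""
    qdcount, ancount = struct.unpack("!HH", packet[4:8])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = _skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:  # A record holding an IPv4 address
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return sorted(addresses)


def resolve_via(server, name, timeout=3.0):
    """Ask one specific DNS server (by IP address) for the A records of `name`."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(512)
    return parse_a_records(response)
```

Calling resolve_via("208.67.222.222", "images.opendns.com") and then the same thing with your ISP's resolver address lets you compare the two answers. Whether they actually differ is the post's guess, not something this sketch can promise.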

Step 1: Setup was trivial. I logged in to my router and deposited the DNS server addresses and hit “Apply”. The router did its usual thing and voila, I was using OpenDNS.

Step 2: Setup an account on OpenDNS. Easy, provide an email address and add a network. In this case, it correctly detected the public IP address of my router and populated its page. It said it would take 3 minutes to propagate within the OpenDNS network. There’s an email confirmation, you click on a link, you know the drill.

Step 3: Setup stats (default: disabled)

All easy, web page resolution seems to be working OK. Let me go and look at those stats they talk about. (Click on the picture below to see it at full size).

Stats don't seem to quite work!

August 17th you say? Today is September 1st. I guess tracking 16 billion or so DNS queries for two days in a row is a little too much for their infrastructure. I can suggest a database or two that would not break a sweat with that kind of data inflow rate.

July 30, 2009: OpenDNS announces that for the first time ever, it successfully resolves more than 16 Billion DNS queries for 2 days in a row.

(source: http://www.opendns.com/about/overview/)

So far, so good. I’ve got to see what this new toy can do :) Let’s see what other damage this thing can cause.

Content Filtering

Nice, they support content filtering as part of the setup. That could be useful. Right now, I reduce annoyances on my browsing experience with a suitably crafted “hosts” file (Windows Users: %SYSTEMROOT%\system32\drivers\etc\hosts).

127.0.0.1       localhost
127.0.0.1       ad.doubleclick.net
127.0.0.1       hitbox.com
127.0.0.1       ai.hitbox.com
127.0.0.1       googleads.g.doubleclick.net
127.0.0.1       ads.gigaom.com
127.0.0.1       ads.pheedo.com
[... and so on ...]
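For anyone who has not used a hosts file this way: each line maps a name to an address, and pointing an ad server's name at 127.0.0.1 means the browser ends up talking to the local machine instead. Here is a small Python sketch of that lookup logic. The function names are hypothetical, and it only models how entries like the ones above are read; it does not touch the real hosts file.

```python
SAMPLE_HOSTS = """\
127.0.0.1       localhost
127.0.0.1       ad.doubleclick.net
127.0.0.1       hitbox.com
127.0.0.1       ai.hitbox.com
127.0.0.1       googleads.g.doubleclick.net
"""

# Addresses that send a lookup nowhere useful (the post uses 127.0.0.1).
SINK_ADDRESSES = {"127.0.0.1", "0.0.0.0", "::1"}


def parse_hosts(text):
    """Map each host name in hosts-file text to its address (first entry wins)."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), address)
    return mapping


def is_blocked(hostname, mapping):
    """True if the hosts entries redirect `hostname` to the local machine."""
    hostname = hostname.lower()
    if hostname == "localhost":
        return False  # localhost is meant to resolve locally
    return mapping.get(hostname) in SINK_ADDRESSES
```

With the sample above, is_blocked("ad.doubleclick.net", parse_hosts(SAMPLE_HOSTS)) is true, while a name that has no entry falls through to normal DNS.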

I guess I can push this “goodness” over to OpenDNS and reduce the pop-up crap that everyone will get at home. (click on image for a higher resolution version of the screen shot)

Content Filtering on OpenDNS

Multiple Networks!

Very cool! I can setup multiple networks as part of a single user profile. So, my phone and my home router could both end up being protected by my OpenDNS profile.

I wonder how that would work when I’m in a location that hands out a non-routable DHCP address; such as at a workplace. I guess the first person to register the public IP of the workplace will see traffic for everyone in the workplace with a per-PC OpenDNS setting that shares the same public IP address? Unclear, that may be awkward.

Enabling OpenDNS on a Per-PC basis.

In last night’s post, I had questioned the rationale of enabling OpenDNS on a per-PC basis. I guess there is some value to this because OpenDNS provides me a way to influence the name resolution process. And, if I were to push content filtering onto OpenDNS, then I would like to get the same content filtering when I was not at home; e.g. at work, at Starbucks, …

I’m sure that over-anxious-parents-who-knew-a-thing-or-two-about-PC’s could load the “Dynamic IP” updater thing on a PC and change the DNS entries to point to OpenDNS before junior went away to college :)

So, I guess that per-PC OpenDNS settings may make some sense; it would be nice to have an easy way to enable this when required. I guess that is a fun project to work on one of these days when I’m at Starbucks.

Jeremiah says, “I do it on a per computer basis because I occasionally need to disable it. (Mac OS X makes this super quick with Locations)”. Jeremiah, please do tell why you occasionally need to disable it. Does something fail?

Other uses of OpenDNS

kuzux writes in response to my previous post that OpenDNS can be used to get around restrictive ISPs. That is interesting because the ISPs that have put these restrictions in place are likely only blocking name resolution and not connection and traffic. Further, the ISPs could just as well find the IP addresses of sites like OpenDNS and put a damper on the festivities. And, one does not have to look far to get the IP addresses of the OpenDNS servers :)

Two thoughts come to mind. First, if the authorities (in Turkey as Kuzux said) put the screws on OpenDNS, would they pour out the DNS lookup logs for specific IP addresses that they cared about (both source and destination)? Second, a hypothetical country with huge manufacturing operations, a less stellar human rights record, and a huge number of take-out restaurants all over the US (that shall remain nameless), could take a dim view of a foreigner who had OpenDNS on his/her laptop and was able to access “blocked” content.

Other comments

Janitha Karunaratne writes in response to my previous post that, “Lot of times if it’s a locked down limited network, they will intercept all DNS traffic, so using OpenDNS won’t help (their own default DNS server will reply no matter which DNS server you try to reach)”. I guess I don’t understand how that could be. When a machine attempts a DNS lookup, it addresses the packet specifically to the DNS server that it is targeting. Are you suggesting that these “locked down limited networks” will intercept that packet, redirect it to the in-house DNS server and have it respond?

David Ulevitch (aka Founder and CTO of OpenDNS) writes, “Yeah, there are all kinds of reasons people use our service. Speed, safety, security, reliability… I do tests when I travel, and have even done it with GoGo on a VA flight and we consistently outperform”. Mr. Ulevitch, your product is wonderful and easy to use. Very cool. But, I wonder about this performance claim. When I am traveling, potentially sitting in an airport lounge, a hotel room, a coffee shop or in a train using GPRS based internet service with unknown bandwidth, is the DNS lookup a significant part of the response time to a page refresh, mail message download, (insert activity of your choice)?

My Point of View

It seems to work, it can’t hurt to use it at home (if my ISP has a problem with it, they can block traffic to the IP address). It doesn’t seem to be appreciably faster or slower than my ISP’s DNS. I’ll give it a shot for a while and see what the statistics say (when they get around to updating them).

OpenDNS is certainly an easy to use, non-disruptive service and is worth giving a shot. If you use the free version of OpenDNS (i.e., don’t create an account, just point to their name servers), there is little possible downside; if you get on a Virgin Atlantic flight, you may need to disable it. But, if you use the registered site, just remember that OpenDNS is collecting a treasure trove of information about where you go, what you do, and they have your email address, IP address (hence a pretty good idea of where you live). They already target advertising to you on the default landing page for bad lookups. I’m not suggesting that you will get spammed up the wazoo but just bear in mind that you have yet another place where a wealth of information about you is getting quietly stored away for later use.

But, it is a cool idea. Give it a shot.

OpenDNS and paid wireless services.

I stumbled on this post, not sure how.

http://thegongshow.tumblr.com/post/176629519/virgin-america-inflight-wireless

I didn’t know about OpenDNS but it seems like a strange idea; after all, why would someone not use the DNS server that came with their DHCP lease? Seems like a fairly simple idea, over-ride the DNS servers and force the choice to be to use the DNS servers that OpenDNS provides, but why would anyone care to do this on a PC by PC basis? That seems just totally illogical!

And the problem that the author of the blog (above) mentioned is fairly straightforward. The PC comes up on the Virgin Wireless network and attempts to go to a web page. It sends out a connection request for the page (assume that the address came from a cache of IP addresses) and receives an HTML redirect to a named page (asking one to cough up money). The redirect mentions the page by name, that name is not cached, and so it results in a DNS lookup which goes to OpenDNS. Those servers (hardcoded into the network configuration) are not accessible because the user has not yet coughed up said money. Conversely, if the initial lookup was not from a cached IP address, then the DNS lookup (which would have been sent to OpenDNS) would not have made it very far (OpenDNS’s server is not reachable till you cough up money). One way or the other, the page asking the user to cough up cash would not have shown up.
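The two cases in that paragraph can be walked through with a toy model in Python. This is only a sketch of the reasoning, not of any real captive portal: the function name and the two boolean inputs are invented for illustration, and the real network on the flight may behave differently.

```python
def paywall_page_shown(resolver_reachable, first_ip_cached):
    """Walk through the steps described above and report how far we get.

    resolver_reachable: can the PC reach its configured DNS servers before paying?
    first_ip_cached:    was the address of the first web page already cached?
    Returns (shown, trace), where trace lists the steps taken.
    """
    trace = []
    if first_ip_cached:
        trace.append("site address found in local cache, no lookup needed")
    elif resolver_reachable:
        trace.append("site name resolved")
    else:
        trace.append("site lookup failed: resolver unreachable")
        return False, trace

    trace.append("request intercepted, redirected to the named payment page")
    if not resolver_reachable:
        # The payment page is referred to by name, so it needs a lookup too.
        trace.append("payment page lookup failed: resolver unreachable")
        return False, trace

    trace.append("payment page resolved and displayed")
    return True, trace
```

With hardcoded OpenDNS servers that are blocked until you pay, resolver_reachable is false and both values of first_ip_cached end without the payment page. With the DNS server handed out by DHCP, it is true and the page shows up.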

So, could one really do this OpenDNS thing behind a pay-for-internet-service? Not unless you can add preferred DNS servers to network configuration without a reboot. (the reboot will get the cycle of DHCP request going again, and on the high volume WiFi access services, DHCP request will automatically expire the previous lease).

But, to the more basic issue, why ever would someone enable this on a PC-by-PC basis? I can totally understand a system administrator using this at an enterprise level; potentially using the OpenDNS server instead of the ones that the ISP provided. Sure beats me! And there sure can’t be enough money in showing advertisements on the redirect page for missing URL’s (or there must be a lot of people who fat-finger URL’s a lot more than I do).

And, the functionality of being able to watch my OpenDNS traffic on a dashboard, I just don’t get it. Definitely something to think more about … They sure seem to be a successful operation so there must clearly be some value to the service they offer, to someone.

When “free” isn’t quite “free”. Or, remember to read the fine print on online photo sharing sites.

I have been a happy user of Flickr (I use the “frugal” account) till yesterday when I got a big pop-up box about a 200 image limit. Apparently Flickr does offer unlimited free image storage but the fine print says that only the most recent 200 can be shared. Not the worst thing in the world but I began to think of all the other things I did not like about Flickr and so I started to look around and see whether I had some other alternatives.

There are any number of online image sharing sites. Many of them are also linked with photo printing services, and arguably that is where the money is. It appears that the devil is most certainly hiding in the fine print. Here is a short summary of what I found. I’m planning to move to Shutterfly; do you have some experience with them which makes this a bad idea?

Flickr
References: http://www.flickr.com/help/limits/
Free account: Unlimited (modest resolution) image storage. No more than 200 can be shared at a time. There are upload limits on the number and size of pictures that you can upload.
Paid accounts: Unlimited upload, storage and high resolution storage. $24.95 per year.

Photobucket
References: http://photobucket.com/faq?catID=29&catSelected=f&topicID=320
            http://photobucket.com/faq?catID=41&catSelected=f&topicID=323
            http://photobucket.com/faq?catID=39&catSelected=f&topicID=520
Free account: Limited (modest resolution) images. Periodic logins are required; failure to do so will deactivate media.
Paid accounts: Unlimited capacity, FTP uploads (what a concept), no advertising, personal URL’s. $24.95 per year.

Shutterfly
References: http://www.shutterfly.com
Free account: Free unlimited storage of pictures. Images stored at high resolution. But you cannot download at high resolution. Personalized web portal.
Paid accounts: I don’t think they even offer a paid account option. My kind of place!

Snapfish (HP)
References: http://getsatisfaction.com/snapfish/topics/downloading_snapfish_photos
            http://www1.snapfish.com/helppricing#hires
Free account: Snapfish offers unlimited photo sharing and storage. Customers must be “active”. The bar for an active customer is that you just need to make one purchase a year. But read the references above. You can store high resolution images but downloading high resolution images is not free. OUCH!
Paid accounts: No paid account.

Picasa (Google)
Free account: 1GB limit (seems odd for the company that claims that storage is unlimited). Other limits also apply.
Paid accounts: No paid offering that I could find.

Smugmug
No ads! Now, isn’t this a great graphic to illustrate the success in targeted advertising and reinforce their point?
Smug Mug "No Ads!"
References: http://www.smugmug.com/photos/photo-sharing-sites-compared/
Free account: No free offering. There is a 14 day free trial.
Paid accounts: Standard: $39.95/year. Power: $59.95/year. Pro: $149.95/year.

Winkflash
References: http://www.winkflash.com/
            http://www.winkflash.com/content/storage.asp
Free account: Free image hosting. Free unlimited storage. Free high resolution image downloads.
Paid accounts: 100% FREE. Too good to be true?

MPIX
References: http://mpix.com/
Free account: Free site for 60 days. After that, need an order to keep content online.
MPIX is a professional print outfit; online sharing is not their primary business.

WHCC (White House Custom Color)
References: http://www.whcc.com/
WHCC is a professional print outfit. They don’t do photo galleries and sharing stuff. Go here if you want serious prints.
Free account: Not offered.
Paid accounts: Not offered.

I think I’m heading to Shutterfly. On Sept 5th I found this Shutterfly article (answer id 181):

“Currently, we do not have full resolution downloading available. However, we do have an Archive DVD service that you can order which contains full resolution copies of your pictures. The images on an Archive DVD will not include any of the Shutterfly enhancements or rotations that have been applied to an image loaded to Shutterfly.”

What BS is this? I guess I’m going to stick with Flickr for a while longer.

Also read http://forums.dpreview.com/forums/read.asp?forum=1023&message=32634142

Is my lens busted? a.k.a. How to take pictures from a boat!

Last week, I had a great time out on a friend’s boat and managed to shoot a lot of pictures. But, when I got home and looked at the day’s work on the PC monitor, many of the pictures appeared hazy. I have been concerned about lens damage for some time now, and my first suspicion was that the lens was in fact damaged. Here is the raw data for you to look at as well.

Equipment

Nikon D200, 18-200 VR

Configuration

ISO Auto, standard program, exposure compensation +0.0 eV, shooting Fine JPEG and RAW

Some image comparisons

1. A picture of a boat in Quincy (Captains Cove). I was standing on the dock, picture was taken at about 8 feet.

f/9, 1/320s, ISO 100, focal length 20mm.

Mio Padre, Quincy, MA

Mio Padre, Quincy, MA

Mio Padre, Quincy, MA

Mio Padre, Quincy, MA

2. A picture of the Boston Skyline from a moving boat.

f/9, 1/320s, ISO 100, focal length 38mm.

Boston Skyline

Boston Skyline

Boston Skyline

Boston Skyline

Clearly, the lens is able to focus and take sharp images, and I was shooting a fairly fast shutter speed (1/320s) so why is the skyline appearing blurred? I did have the VR enabled on both images so it couldn’t have been vibration.

Camera shake vs. Camera movement

As you can see from the picture of the skyline, the screws were going pretty good; we were probably doing about 15 knots. At 15 knots (about 28 km/h) the camera would have moved about 2cm in 1/320s. As a result, the lack of clarity was more like “camera shake” than it was distortion.

So, how exactly does one take pictures from a moving boat? Next time, I’ll go for something faster than 1/320s. Considering that the light was good enough for f/9 at ISO 100, I could shoot the same picture at 1/1000s or faster and still be fully exposed.

Assuming the same 15 knots, and assuming that the movement was at 90° to the plane of the subject, the camera would still have moved almost a centimeter in the 1 ms that the shutter was open. I suspect that this movement had less to do with the distortion than the side-to-side movement of the boat. After all, the angular change of the 1cm over almost a mile is minuscule but rotating the camera (which is what the side to side movement would do) may have a more profound blurring effect. Till next time then …
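The arithmetic above is easy to check with a few lines of Python, and the same sketch can put rough numbers on the hunch that rotation matters more than travel. The speed, shutter speeds, 38mm focal length and "almost a mile" distance come from the post. Everything else is an assumption labeled in the code: the D200 sensor figures (23.6mm wide, 3872 pixels across) are quoted from memory, and the 5 degrees per second of boat roll is a number picked purely for illustration.

```python
import math

KNOT_IN_M_PER_S = 1852 / 3600   # one knot is one nautical mile (1852 m) per hour
MILE_IN_M = 1609.344

# Assumption, not from the post: D200 sensor is 23.6 mm wide and 3872 pixels across.
PIXEL_PITCH_MM = 23.6 / 3872
# Assumption, not from the post: the boat rotates the camera at 5 degrees per second.
ASSUMED_ROLL_DEG_PER_S = 5.0


def travel_cm(speed_knots, shutter_s):
    """Distance the camera travels with the boat while the shutter is open."""
    return speed_knots * KNOT_IN_M_PER_S * shutter_s * 100


def blur_pixels(angle_rad, focal_mm, pixel_pitch_mm=PIXEL_PITCH_MM):
    """Approximate shift of the image on the sensor, in pixels, for a small angle."""
    return focal_mm * math.tan(angle_rad) / pixel_pitch_mm


def translation_blur_px(speed_knots, shutter_s, distance_m, focal_mm):
    """Blur from the boat's travel, treating all of it as sideways motion (worst case)."""
    sideways_m = travel_cm(speed_knots, shutter_s) / 100
    return blur_pixels(math.atan(sideways_m / distance_m), focal_mm)


def rotation_blur_px(roll_deg_per_s, shutter_s, focal_mm):
    """Blur from the camera rotating while the shutter is open."""
    return blur_pixels(math.radians(roll_deg_per_s * shutter_s), focal_mm)
```

Under these assumptions, travel_cm(15, 1/320) comes out at roughly 2.4cm and travel_cm(15, 1/1000) at a bit under 0.8cm. For a subject a mile away at 38mm, the travel smears the image by well under a pixel, while the assumed roll smears it by more than a pixel at 1/320s. Change the roll rate and the answer changes with it, so this supports the suspicion without proving it.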

P.S. More pictures of this trip are on the flickr stream at http://www.flickr.com/photos/amrithk2000

Making life interesting

I generally don’t like chain letters and SPAM but once in a while I get a real masterpiece. Below is one that I received yesterday …

Working people frequently ask retired people what they do to make their days interesting. Well, for example, the other day the wife and I went into town and went into a shop. We were only in there for about 5 minutes.

When we came out, there was a cop writing out a parking ticket. We went up to him and I said, “Come on man, how about giving a senior citizen a break?”

He ignored us and continued writing the ticket. I called him a Dumb ass. He glared at me and started writing another ticket for having worn tires. So Mary called him a shit head. He finished the second ticket and put it on the windshield with the first.

Then he started writing a third ticket. This went on for about 20 minutes.

The more we abused him, the more tickets he wrote.

Just then our bus arrived and we got on it and went home.

We try to have a little fun each day now that we’re retired.

It’s important at our age.

Priceless!

I’ve lost my cookies!

Some days ago I read an article about Super Cookies (a link is on the Breadcrumbs tab on the right of my blog). I didn’t think to check my machine for these super cookies and did not pay close attention to the lines that said:

GNU-Linux: ~/.macromedia

Sure enough …

amrith@amrith-laptop:~$ find . -name '*.sol' 2>/dev/null | wc -l
141
amrith@amrith-laptop:~$

Hmmm …

amrith@amrith-laptop:~$ find . -name '*.sol' -delete 2>/dev/null
amrith@amrith-laptop:~$ find . -name '*.sol' 2>/dev/null | wc -l
0
amrith@amrith-laptop:~$

Much better. And my flash player still works. Let’s go look at some video …

amrith@amrith-laptop:~$ find . -name '*.sol' 2>/dev/null
./.macromedia/Flash_Player/macromedia.com/support/flashplayer/sys/settings.sol
./.macromedia/Flash_Player/macromedia.com/support/flashplayer/sys/#s.ytimg.com/settings.sol
./.macromedia/Flash_Player/#SharedObjects/CZRS8QS7/s.ytimg.com/soundData.sol
./.macromedia/Flash_Player/#SharedObjects/CZRS8QS7/s.ytimg.com/videostats.sol
amrith@amrith-laptop:~$

Time to fix this sucker …

amrith@amrith-laptop:~$ tail -n 1 .bashrc
find "$HOME/.macromedia" -name '*.sol' -delete 2>/dev/null
amrith@amrith-laptop:~$
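The same cleanup can be done from a script. Here is a minimal Python sketch, assuming the ~/.macromedia location mentioned above. The function name is made up, and the code simply deletes every file ending in .sol under the given directory, so point it at the right place.

```python
from pathlib import Path


def remove_flash_cookies(root):
    """Delete every *.sol file under `root` and return the paths that were removed."""
    root = Path(root)
    removed = []
    if not root.is_dir():
        return removed  # nothing to do if the directory does not exist
    for path in sorted(root.rglob("*.sol")):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed
```

Calling remove_flash_cookies(Path.home() / ".macromedia") does what the line in .bashrc does, and the returned list shows which sites had left something behind.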

Boston Cloud Services meetup yesterday

Tsahy Shapsa of aprigo organized the second Boston Cloud Services meetup yesterday. There were two very informative presentations, the first by Riki Fine of EMC on the EMC Atmos project and the second by Craig Halliwell from TwinStrata.

What I learnt was that Atmos was EMC’s entry into the cloud arena. The initial product was a cloud storage offering with some additional functionality over other offerings like Amazon’s. Key product attributes appear to be scalability into the petabytes, policy and object metadata based management, multiple access methods (CIFS/NFS/REST/SOAP), and a common “unified namespace” for the entire managed storage. While the initial product is a cloud storage offering, there was a mention of a compute offering in the not too distant future.

In terms of delivery, EMC has set up its own data centers to host some of the Atmos clients. But, they have also partnered with other vendors (AT&T was mentioned) who would provide cloud storage offerings that expose the Atmos API. AT&T’s web page reads:

AT&T Synaptic Cloud Storage uses the EMC Atmos™ backend to deliver an enterprise-grade global distribution system. The EMC Atmos™ Web Services API is a Web service that allows developers to enable a customized commodity storage system over the Internet, VPNs, or private MPLS connectivity.

I read this as a departure from the approach being taken by the other vendors. I don’t believe that other offerings (Amazon, Azure, …) provide a standardized API and allow others to offer cloud services compliant to that interface. In effect, I see this as an opportunity to create a marketplace for “plug compatible” cloud storage. Assume that a half dozen more vendors begin to offer Atmos based cloud storage, each offering a different location, SLA’s and price point, an end user has the option to pick and choose from that set. To the best of my knowledge, today the best one can do is pick a vendor and then decide where in that vendor’s infrastructure the data would reside.

Atmos also seems to offer some cool resiliency and replication functionality. An application can leverage a collection of Atmos storage providers. Based on policy, an object could be replicated (synchronously or asynchronously) to multiple locations on an Atmos cloud with the options of having some objects only within the firewall and others being replicated outside the firewall.

Enter TwinStrata who are an Atmos partner. They have a cool iSCSI interface to the Atmos cloud storage. With a couple of clicks of a mouse, they demonstrated the creation of a small Atmos based iSCSI block device. Going over to a windows server machine and rescanning disks they found the newly created volume. A couple of clicks later there was a newly minted “T:” that the application could use, just as it would a piece of local storage. TwinStrata provides some additional caching and ease of use features. We saw the “ease of use” part yesterday. The demo lasted a couple of minutes and no more than about a dozen mouse clicks. The version that was demo’ed was the iSCSI interface, there was talk of a file system based interface in the near future.

Right now, all of these offerings are expected to be for Tier-3 storage. Over time, there is a belief that T2 and T1 will also use this kind of infrastructure.

Very cool stuff! If you are in the Boston area and are interested in the Cloud paradigm, definitely check out the next event on Sept 23rd.

Pizza and refreshments were provided by Intuit. If you haven’t noticed, the folks from Intuit are doing a magnificent job fostering these kinds of events all over the Boston Area. I have attended several excellent events that they have sponsored. A great big “Thank You” to them!

Finally, a big “Thank You” to Tsahy and Aprigo for arranging this meetup and offering their premises for the meetings.

Twitter: the beginning of the end?

After a spectacular rise to fame during the recent unrest in Iran, Twitter seems to have reacted to all the publicity and limelight like many failed stars before it (myspace anyone?). With no clear business model, a lawsuit being brought by TechRadium for patent infringement, and actions that can only alienate the developer community (e.g. cease-and-desist against mytwitterbutler.com), I have to think Twitter is heading for big trouble.

It was a great concept: micro-blogging. When it came out, people thought it was a piece of crap (who would bother to send <140 byte things that others would care about) but they were wrong. But, unless Twitter can figure out how to embrace the developer community and foster innovation around its platform, history will not have good things to say about them.

Twitter has exactly two things going for it:

  • the number of registered users on Twitter and the network between those users, and
  • the API that others can use to build applications to leverage the network (and by extension, the network of developers)

Other than that, I wonder what Twitter has? And by alienating its developer community, Twitter may well purchase its one-way ticket to oblivion. Twitter clones are a dime a dozen but they don’t have the network and registered base that Twitter has. Consider the case of the Apple iPhone. It has become so popular, and has exploited the catch-phrase “there’s an app for that” so effectively. But, there’s a huge entry barrier to making a device like the iPhone. Not including all the hurdles that one must pass to get carrier approval, the iPhone is a tangible object that requires a significantly larger investment than a Twitter Clone.

In my opinion, all it takes now is the creation of a small number of killer apps on a clone’s platform to seal the fate of Twitter. To use Warren Buffett’s terminology, the iPhone has a “wide moat”. I don’t think the same can be said of Twitter today.

More sunsets

The sunset this evening was quite spectacular. The picture (panoramic) was taken from Prospect Hill Road in Harvard, MA.

Click on the image to see a higher resolution picture on flickr. The image here has been cropped to make sure it shows up properly on the blog interface.

Sunset in Harvard, MA


Are you getting the whole picture?

When taking pictures, the field of vision is often overlooked. A normal point-and-shoot camera has a field of vision of about 40°x35°, but human eyes give you a field of vision of almost 200°x130°. Very often, you come upon a breathtaking sight, whip out your camera, and shoot some pictures. When you get back home and look at the pictures on a PC monitor, they don’t look quite the same.

To get some idea of what a short focal length lens (wide field of vision) can do for you, take a look at this picture on Ken Rockwell’s web page. The image that I would like you to look at is here. This awesome image is copyrighted by KenRockwell.com. If you are a photo buff, you should bookmark kenrockwell.com and subscribe to the RSS feed. I find it absolutely invaluable.

I don’t have this kind of amazing 13mm lens but a panoramic image using stitching can produce a similar field of view.
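To put rough numbers on "wide", here is a minimal sketch of the standard angle-of-view formula for a rectilinear lens focused at infinity. The function name is hypothetical, and the 36mm sensor width (a full-frame sensor) is my assumption; a smaller sensor gives a narrower view from the same lens.

```python
import math


def horizontal_fov_degrees(focal_length_mm, sensor_width_mm=36.0):
    """Horizontal angle of view, in degrees, of a rectilinear lens.

    Assumes focus at infinity. The 36mm default is the width of a
    full-frame sensor and is an assumption, not a value from the post.
    """
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))


if __name__ == "__main__":
    for focal_length in (13, 50):
        fov = horizontal_fov_degrees(focal_length)
        print(f"{focal_length}mm lens: about {fov:.0f} degrees wide")
```

Under those assumptions a 13mm lens covers roughly 108° horizontally, while a 50mm lens covers roughly 40°, which is in line with the point-and-shoot figure quoted earlier.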

Panoramic image of a rainbow


Panoramic images are a very cost-effective way to get pictures with a very wide field of vision. If you are interested in the science and technology behind converting multiple segments of an image into a single panoramic image, you can refer to the FAQ at AutoStitch. There is an interesting paper on how all this works that you can read here, and there is an informative presentation that goes with that paper.

Panoramic images are also better than short focal length lenses because there is less distortion towards the edges. Notice that the houses at the right and left edge of the first image above appear to be leaning. With panoramic stitching these effects can be eliminated.

Some quick tips if you plan to take a panoramic picture.

  1. Set the camera to manual exposure mode to reduce the corrections that need to be done in software.
  2. Use a tripod and make sure that you get complete coverage of the area that you want to stitch.
  3. Make sure that you overlap images by about a third. I usually turn on the visible grid in the viewfinder to help with this.
  4. Take lots of pictures; nothing beats practice.
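The overlap tip also tells you roughly how many frames to shoot. The sketch below is my own back-of-the-envelope arithmetic, not something AutoStitch requires: the first frame covers its full width, and every later frame adds only the part that does not overlap its neighbor. The function name and parameters are hypothetical.

```python
import math


def frames_needed(target_fov_deg, frame_fov_deg, overlap=1 / 3):
    """Estimate how many frames a single-row panorama needs.

    target_fov_deg: total horizontal angle you want to cover
    frame_fov_deg:  horizontal angle covered by one frame
    overlap:        fraction of each frame shared with its neighbor
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be at least 0 and less than 1")
    if target_fov_deg <= frame_fov_deg:
        return 1
    # Each frame after the first contributes only its non-overlapping part.
    new_coverage = frame_fov_deg * (1 - overlap)
    extra_frames = (target_fov_deg - frame_fov_deg) / new_coverage
    # Round first so floating-point noise cannot add or drop a frame.
    return 1 + math.ceil(round(extra_frames, 9))


if __name__ == "__main__":
    print(frames_needed(200, 40), "frames to cover 200 degrees with 40 degree frames")
```

With 40° frames overlapping by a third, a 200° sweep works out to seven frames. In practice I would shoot a frame or two extra at each end so there is room to crop.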

My previous post showed some panoramic sunsets. I took several sets of pictures, such as the five below. These were then stitched together using a program called AutoStitch. You can get a copy of AutoStitch at http://www.autostitch.net/

image-1 image-2 image-3 image-4 image-5

Enjoy!

Sunset in Rye, New Hampshire

I got two interesting pictures of the sunset in Rye, NH today. Each picture was made by stitching together five images using AutoStitch.

The two pictures were taken five minutes apart, and the colors changed noticeably in that five-minute interval.

And about five minutes later

I have reduced these images to 70% of their original size so that they display properly on the blog page. Larger images are on flickr (use the left panel).


Look Ma! NoSQL!

There has definitely been more chatter about NoSQL in the Boston area lately. I hear there is a group forming around NoSQL (I will post more details when I get them). There were some NoSQL folks at the recent CloudCamp, which I was not able to attend (damn!).

My views on NoSQL are unchanged from an earlier post on the subject. I think there are some genuine issues about database scaling that are being addressed through a variety of avenues (packages, tools, …). But, in the end, the reason that SQL has survived for so long is that it is a declarative language that is reasonably portable. That is also the reason why, in the data warehousing space, you have each vendor going off and doing some non-SQL extension in a totally non-portable way. And, IMHO, they are all going to have to reconcile their differences before the features get wide mainstream adoption.

This morning I read a well-researched blog post by BJ Clark by way of Hacker News. (If you don’t use HN, you should definitely give it a try.)

I strongly recommend that if you are interested in NoSQL, you read the conclusion section carefully. I have annotated the conclusion section below.

“NoSQL is a great tool, but it’s certainly not going to be your competitive edge, it’s not going to make your app hot, and most of all, your users won’t give a shit about any of this.

What am I going to build my next app on? Probably Postgres.

Will I use NoSQL? Maybe. [I would not, but that may just be my bias]

I might keep everything in flat files. [Yikes! If I had to do this, I'd consider something like MySQL CSV first]


If I need reporting, I won’t be using any NoSQL.

If I need ACIDity, I won’t use NoSQL.

If I need transactions, I’ll use Postgres.

…”

NoSQL is a great stepping stone; what comes next will be really exciting technology. If what we need is a database that scales, let’s go make ourselves a database that scales. Base it on MySQL, PostgreSQL, … but please make it SQL-based. Extend SQL if you have to. I really do like being able to coexist with the rich ecosystem of visualization tools, reporting tools, dashboards, … you get the picture.



A conversation with Rep. Brownsberger

Recently I exchanged some email with Rep. Will Brownsberger on the subject of non-competes. I reproduce below the essence of the email I sent him.

Companies that require three founders and can get to a working prototype in three months on ~ $18,000 are being funded in large numbers by a variety of incubation firms. The risk with this approach is that the core technology and intellectual property being developed are not highly specialized, and the work can easily be outsourced out of the US. More and more companies (all over the country) are being set up with their core R&D team offshore, where fungible, low-cost, less experienced developers are freely available.

On the other hand, the companies and industries that have shown good gains through booms and busts are the ones that take time, capital and ingenuity to build. And those are the kinds of companies that will provide longer-term benefits to the region in the form of jobs, a skilled workforce, and a stable and highly paid tax base. Companies like that hire on both coasts to tap into the talent pool available across the country.

It is therefore in the interest of Massachusetts to encourage industries to work with educational institutions, attract the best and brightest students, and encourage them to stay. That, I believe, is the engine that drove Silicon Valley and Massachusetts; I hazard a guess that per square mile and per capita, MA and CA have the highest number of institutions of higher learning in the sciences and technology-related fields. Those investments, made many years ago, are, I believe, what brought these two regions the lion’s share of the innovation economy.

In my opinion, measures to increase the co-operation between educational institutions and industry would do more to foster innovation and spur growth in the region than the ill-advised battle with non-competes.

There are bigger concerns for the high technology environment in Massachusetts, and for that matter in the United States, than the issue of non-competes.

Some other perspectives on the non-compete discussion

I read two other well-written perspectives on the non-compete debate.

The first is a letter from George F. Colony (Chairman and CEO, Forrester Research, Inc.) to Rep. Will Brownsberger. You can read the entire letter at http://willbrownsberger.com/index.php/archives/2251

In it, he writes:

In 26 years, I have yet to see or hear of any empirical proof that supports the elimination of non-compete contracts. Do a search and you’ll see that all references are either anecdotal or hypothetical. Non-competes are one of the reasons businesses should start in or continue to call Massachusetts their home.

The second article was in Mass High Tech; a link to the article is here. In this article, Jackie Noblett quotes Paul Dacier, general counsel of Hopkinton-based EMC Corp.:

“Massachusetts law is clear on its face and unambiguous when it comes to noncompetes, and any form of legislation that tries to change that I think is inappropriate.”

I do not agree with this point of view; I think there are good reasons for a change. The article goes on to say:

Dacier and other executives argue the voluntary nature of the agreements — not all companies have to require noncompete agreements and employees don’t have to sign them — makes the point for the status quo.
