Tuesday, December 6, 2016

In which people discuss things I don't understand

  • Can Uber Ever Deliver? Part One – Understanding Uber’s Bleak Operating Economics
    If rapid growth could not drive major margin improvements between 2012 and 2016, there is no reason to believe that Uber will suddenly find billions in scale economies going forward. Fundamentally digital companies like Amazon, eBay, Google and Facebook had massive operating scale economies because the marginal cost of expanded operations was close to zero. Aggressive pricing fueled the growth that drove major margin improvements and also created major consumer welfare benefits.

    By contrast, in the hundred years since the first motorized taxi, there has been no evidence of significant scale economies in the urban car service industry. That explains why successful operators never expanded to other cities and why there was no natural tendency towards concentration in individual markets. Drivers, vehicles and fuel account for 85% of urban car service costs. None of these costs decline significantly as companies grow. As the P&L data above demonstrates, Uber has not discovered a magical new way to drive down unit costs.

  • Can Uber Ever Deliver? Part Two: Understanding Uber’s Uncompetitive Costs
    Every other transport industry depends on highly centralized management using highly sophisticated systems to ensure that capital assets are highly utilized and tightly scheduled around market demand. The Uber business model implies that all these industries are horribly wrong; decentralizing asset purchasing, maintenance and scheduling to isolated low-wage workers would not only reduce costs, but create an efficiency gain large enough to drive all incumbent operators out of business. No one has produced any economic evidence demonstrating that the Uber view might be correct.
  • Can Uber Ever Deliver? Part Three: Understanding False Claims About Uber’s Innovation and Competitive Advantages
    Hundreds of other consumer industries have migrated from telephone ordering to smartphone and internet ordering (pizza delivery, airline booking), but there is not a single case where this had any material impact on industry competition, much less created tens of billions of dollars in corporate value. The major emphasis on the app in pro-Uber articles appears to be symbolic; the app implies the existence of magically new “on-demand” efficiencies (just push a button and your car appears).

    Highlighting the app also implies that Uber is a “technology company” that has completely “disrupted” industry economics, and is not simply a traditional company like Domino’s Pizza that is utilizing smartphone ordering. Needless to say, none of these articles are written by anyone with actual expertise in ecommerce or urban transportation, and none provide any evidence supporting the claim that the app represents breakthrough technology that gives Uber a powerful competitive advantage.

  • Can Uber Ever Deliver? Part Four: Understanding That Unregulated Monopoly Was Always Uber’s Central Objective
    From its earliest days, Uber’s investors and managers have always recognized that investor returns would require global industry dominance, and the elimination (or effective nullification) of longstanding laws and regulations designed to protect competition, and to protect consumers from the risks of anti-competitive market power[1]. This presumes that urban car services can be turned into a “winner-take-all-game”, where the winner can earn sustainable rents once quasi-monopoly industry dominance has been achieved. Dominance would also allow Uber to leverage its platform in order to expand into other markets that it could not otherwise profitably enter.

Sunday, December 4, 2016

Who says chess is boring?

Check out this marvelous short video on Michael Aigner's site: Tal Meets Qh6 and Carlsen Wins.

Tal, in this case, is not Mikhail Tal, the great Latvian genius who was world champion when I was born, but Tal Baron, the wonderfully-talented young Israeli Grandmaster.

Saturday, December 3, 2016

Keeping up with da netz

Hey, isn't it the holiday season? Aren't things supposed to be quieting down around now?

Apparently not...

  • Infrastructure Update: Pushing the edges of our global performance
    It can take up to 180 milliseconds for data traveling by undersea cables at nearly the speed of light to cross the Pacific Ocean. Data traveling across the Atlantic can take up to 90 milliseconds. This travel time is compounded by the way TCP works. To establish a reliable connection for uploads, the client initiates what’s called a slow start. It sends a few packets of data, then waits for an ACK (or acknowledgement), confirming that the data has been received. The client will then send a larger group of packets and await confirmation, repeating this process until ultimately transmitting data at the user’s full available link capacity. Given the limitations we encounter here—the distance across the Pacific Ocean, and the speed of light—there are only so many optimizations we can make before physics stands in the way.
  • Slicer: Auto-sharding for datacenter applications
    What exactly is Slicer then? It has two key components: a data plane that acts as an affinity-aware load balancer, with affinity managed based on application-specified keys; and a control plane that monitors load and instructs application processes as to which keys they should be serving at any one point in time. In this way, the decisions regarding how to balance keys across application instances can be outsourced to the Slicer service rather than building this logic over and over again for each individual back-end service. Slicer is focused exclusively on the problem of balancing load across a given set of backend tasks.
  • QCon NewYork 2016: The Verification of a Distributed System
    Distributed Systems are difficult to build and test for two main reasons: partial failure & asynchrony. These two realities of distributed systems must be addressed to create a correct system, and often times the resulting systems have a high degree of complexity. Because of this complexity, testing and verifying these systems is critically important. In this talk we will discuss strategies for proving a system is correct, like formal methods, and less strenuous methods of testing which can help increase our confidence that our systems are doing the right thing.
    (Don't miss the awesome list of reference material!)

  • An Approach to Designing Distributed, Fault-Tolerant, Horizontally Scalable Event Scheduler
    For the processing part, a master is elected among the cluster members. Zookeeper could be used for leader/master election, but since BigBen already uses Hazelcast, we used the distributed lock feature to implement a Cluster Singleton. The master then schedules the next bucket and reads the event counts. Knowing the event count and shard size, it can calculate very easily how many shards are in total. The master then creates pairs of (bucket, shard_index) and divides them equally among the cluster members, including itself. In case of unequal division, the master tries to take the minimum load on itself.
  • Hazelcast is the leading open source in-memory data grid.
    If you have programmed applications in Java, you have probably worked with concurrency primitives like the synchronized statement (the intrinsic lock) or the concurrency library that was introduced in Java 5 under java.util.concurrent, such as Executor, Lock and AtomicReference.

    This concurrency functionality is useful if you want to write a Java application that uses multiple threads, but the focus here is to provide synchronization in a single JVM and not distributed synchronization over multiple JVMs. Luckily, Hazelcast provides support for various distributed synchronization primitives such as the ILock, IAtomicLong, etc. Apart from making synchronization between different JVMs possible, these primitives also support high availability: if one machine fails, the primitive remains usable for other JVMs.

  • New – AWS Step Functions – Build Distributed Applications Using Visual Workflows
    Today we are launching AWS Step Functions to allow you to do exactly what I described above. You can coordinate the components of your application as a series of steps in a visual workflow. You create state machines in the Step Functions Console to specify and execute the steps of your application at scale.

    Each state machine defines a set of states and the transitions between them. States can be activated sequentially or in parallel; Step Functions will make sure that all parallel states run to completion before moving forward. States perform work, make decisions, and control progress through the state machine.

  • Performance improvements in bcachefs-testing
    btree nodes are log structured, with multiple sorted sets of keys. In memory, we sort/compact as needed so that we never have more than three different sets of keys: the lookup and iterator code has to search through and maintain pointers into each sorted set of keys, so we don't want to deal with too many. Having multiple sorted sets of keys ends up being a performance win, since the result is that only the newest and smallest is being modified at any given time, and the rest are constant - we can construct lookup tables for the constant sets of keys that are drastically more efficient for lookup, but wouldn't be possible to update without regenerating the entire lookup table.
  • Probabilistic Data Structure Showdown: Cuckoo Filters vs. Bloom Filters
    Probabilistic data structures store data compactly with low memory and provide approximate answers to queries about stored data. They are designed to answer queries in a space-efficient manner, which can mean sacrificing accuracy.

    Like Bloom filters, the Cuckoo filter is a probabilistic data structure for testing set membership. The ‘Cuckoo’ in the name comes from the filter’s use of the Cuckoo hashtable as its underlying storage structure. The Cuckoo hashtable is named after the cuckoo bird because it leverages the brood parasitic behavior of the bird in its design. Cuckoo birds are known to lay eggs in the nests of other birds, and once an egg hatches, the young bird typically ejects the host’s eggs from the nest. A Cuckoo hash table employs similar behavior in dealing with items to be inserted into occupied ‘buckets’ in a Cuckoo hash table.

  • Building robust software with rigorous design documents
    So what, exactly, goes into a design document for a problem domain? What makes these docs so detailed and rigorous? I believe that the hallmark of these designs is an extremely thorough assessment of risk.

    As the owner of a problem domain, you need to look into the future and anticipate everything that could go wrong. Your goal is to identify all of the possible problems that will need to be addressed by your design and implementation. You investigate each of these problems deeply enough to provide a useful explanation of what they mean in your design document. Then you rank these problems as risks based on a combination of severity (low, medium, high) and likelihood (doubtful, potential, definite).

  • How Google Is Challenging AWS
    Still, for all the success Microsoft has had with Office 365, the real giant of cloud computing — which is to say the future of enterprise computing — is, as is so often the case, a company no one saw coming: the same year Google decided to take on Microsoft, Amazon launched Amazon Web Services. What makes AWS so compelling is the way that it reflects Amazon itself: it is built for scale and with clearly-defined and hardened interfaces. Customers — first Amazon but also companies around the world — access “primitives” that can be mixed-and-matched to build a more efficient, scalable, and secure back-end than nearly any company could build on its own.


    Where Kubernetes differs from Borg is that it is fully portable: it runs on AWS, it runs on Azure, it runs on the Google Cloud Platform, it runs on on-premise infrastructure, you can even run it in your house. More relevantly to this article, it is the perfect antidote to AWS’ ten year head-start in infrastructure-as-a-service: while Google has made great strides in its own infrastructure offerings, the potential impact of Kubernetes specifically and container-based development broadly is to make irrelevant which infrastructure provider you use. No wonder it is one of the fastest growing open-source projects of all time: there is no lock-in.

  • 52 things I learned in 2016
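
To get a feel for the slow-start numbers in the Dropbox excerpt above, here is a rough back-of-the-envelope sketch. Everything in it is my own simplification, not Dropbox's model: the function names are made up, the initial window of 10 segments and the 1460-byte segment size are assumed defaults, the window simply doubles every round trip, there is no packet loss and no bandwidth cap, and the 180 ms and 90 ms figures are treated as round-trip times.

```python
import math


def slow_start_round_trips(total_segments: int, initial_window: int = 10) -> int:
    """Count the round trips needed to deliver total_segments when the
    sender starts with initial_window segments and doubles the window
    after every acknowledged round (idealized slow start, no loss)."""
    if total_segments <= 0:
        return 0
    sent = 0
    window = initial_window
    rounds = 0
    while sent < total_segments:
        sent += window   # send one full window, then wait for the ACKs
        window *= 2      # each acknowledged round doubles the window
        rounds += 1
    return rounds


def upload_time_ms(payload_bytes: int, rtt_ms: float,
                   segment_bytes: int = 1460, initial_window: int = 10) -> float:
    """Estimate upload time as (handshake + slow-start rounds) * RTT.
    Assumes one extra round trip for connection setup and ignores
    the actual link capacity."""
    segments = math.ceil(payload_bytes / segment_bytes)
    rounds = slow_start_round_trips(segments, initial_window)
    return (1 + rounds) * rtt_ms


if __name__ == "__main__":
    for label, rtt in (("Pacific", 180), ("Atlantic", 90)):
        print(label, upload_time_ms(1_000_000, rtt), "ms for a 1 MB upload")
```

Under those assumptions the round-trip time dominates: the number of rounds is fixed by the payload size, so halving the distance roughly halves the wait, which is presumably why moving the connection endpoint closer to the user pays off.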

A map of and for the times

I find myself unreasonably obsessed with The Unscientific Bay Area:

Mapping today is dominated by data freaks, obsessed with being scientifically rigorous and statistically significant. But as a data freak I’ve come to realize that not all maps have to involve equations. I want to take a break, be a little unscientific, and put the human element back on the map. Ultimately, cities and neighborhoods are collections of people, and I wanted to map their experiences. As it turns out, these unscientific maps are just as charming, thorough and thought-provoking as any other.

If you load the PNG file for the map into your browser, you can zoom in and scroll around.

And you can lose hours trying to reproduce the process that Trubetskoy must have followed, wandering around Urban Dictionary to find pages like Oakland:

City east of SF Bay, aka "tha town". Separated into 3 parts (North, West, and East Oakland). There is no south. North Oakland is the hills. West Oakland has downtown, lake merritt, chinatown, and jack london square. East Oakland has the airport, coliseum, and the zoo. Deep East Oakland is where you can find the sideshows, people actin' a fool and gettin' hyphy, goin stupid doo doo dumb retarded, smokin perk and chewy, sippin' on some heem or yak, and slappin' hard in they box chevs.

(Looks like Trubetskoy mistakenly coded that as "The Town".)

"There is no south", indeed; that direction leads to my home, accurately described as "increasingly yuppie."

Many of these jargon terms are completely unfamiliar to me. For example, Fruitvale has always been called Fruitvale to me (though I'm an oldster, not hip at all). I've certainly never heard it called East Side Oakland (ESO), though perhaps that term is describing the area where 98th meets East 14th, a bit farther away.

My son will be (perhaps) excited to know that he lives in Haystack, "the heart of the bay":

The city of Hayward, California. It is known as the "heart" of the bay. The city was founded by a man named William Hayward, who came to California to seek his fortune in the California Gold Rush (Began in 1848, I believe). He bought some forty acres from some Rancher, and in a few years sprouted into a town. Was at one point misspelled into "Haywood". Haystack is a slang term when referring to this city.

Person A: I live in tha Haystack
Person B: Um...?
Person A: Hayward
Person B: Where is that?

And I'm sure my daughter would agree that Surf City is indeed known for "awesome local bands."

Anyway, it was an inspired idea, thanks for putting the beautiful map together!

Wednesday, November 30, 2016

And Carlsen it is

Today was the tie-breaker day, and Magnus Carlsen has retained the World Chess Champion title: Magnus Carlsen defeats Sergey Karjakin to retain World Chess Championship – as it happened.

Before the match, I spent a fair amount of time describing Carlsen's astonishing endurance and ability to sustain his concentration over a six-, seven-, or even eight-hour chess game.

But his skill on shorter time frames is even greater.

And, although Karjakin was every bit Carlsen's equal during the standard time control games, today was all Carlsen.

So we move on. As I said, I don't think anyone is pleased that it had to go to tie breaks, but those are the rules and that's the way the match was organized; it was no surprise that this was a possibility.

Everybody is going to have their own opinions about the match, but overall I was pleased. It was fun chess to watch, and I can't wait for the next match! (Of course, not everyone shares my opinion.)

Tuesday, November 29, 2016

The coming Atlantis

Evidently, the European Space Agency's Sentinel-1 satellites have accurate enough instrumentation that they can actually detect the movement of the Millennium Tower: Satellites confirm sinking of San Francisco tower

To probe these subtle shifts, scientists combined multiple radar scans from the Copernicus Sentinel-1 twin satellites of the same area to detect subtle surface changes – down to millimetres. The technique works well with buildings because they better reflect the radar beam.

Over the weekend, my wife and I were walking along the shore of the bay, approximately 7 miles from downtown, with a very clear view on the day after a big storm, and my wife wondered if it was possible to tell which tower was the Millennium Tower from our perspective.

I suspect not.

But if we had a satellite...

Monday, November 28, 2016

Carlsen-Karjakin, game 12: draw (6-6)

And so, the match is complete, with a fairly bloodless 30-move draw in the final game.

As I understand it, there will be a tie-breaking session on Wednesday:

If the score is tied after 12 games, rapid chess (4 games), then blitz chess (five 2-game matches), and possibly an Armageddon game will be used until the tie is broken (Regulations 3.7)

Of course, nobody will be satisfied with this result; tie break procedures are never satisfying.

But, one way or another, on Wednesday we will know for sure.