Thursday

"The Butler did it": Forbrydelsen and the problem of depth-first search in a data-rich world

(going to try and put proper caps in here)
(plus, SPOILERS!)

So Forbrydelsen (The Killing) is over, and guess what? The butler did it.



Well almost; Vagn, the best friend, the persistent companion, handyman, friend to the children, in every episode, always hanging around, never suspected (until the twisty ending)... it couldn't be him, could it? Why, he had an alibi, and no motive*, and didn't fit the psychological profile. Never mind all that - looking at the structure of the story, he's the only person it could have been**. This hurts, actually; for its 20 episodes, Forbrydelsen was stuffed full of believable, well-drawn characters, performed sensitively; and they sacrificed them for the sake of a hack ending that jarred. I spotted it was going to be Vagn around episode 10, based solely on the fact that he was the only character to be always there, and my mum claims an impressive episode 2 (she's read a lot of detective books). You could tell it was going to be him, not because of how he acted, but because of his position in the structure and narrative, and that is annoying.

I don't think there's been such a good dissection of life for a grieving couple since 'Don't Look Now', even if the stages of grief seemed to be compressed into three convenient weeks. This is at odds with the contrived structuring of the program, where the car she had been found in appears to have been driven by no fewer than four people that night, and everybody has a secret alibi, rather than a motive, as would be the case in a classic Poirot investigation. Interestingly, "In Germany they liked it as a whodunnit, but you guys seem to be more into the characters."

Now that that's off my chest, I want to talk about something else: the methods of the detectives. Not from a police procedure point of view, but in terms of data search.

In search analysis, there are two basic methods: breadth-first and depth-first. A problem space is a network; each node is a piece of evidence and you're looking for the node that gives you the critical evidence that links a particular person to your starting node, which is 'there has been a murder'. You don't know where this node is, so you need to explore the problem space. So you have a choice of how you go about exploring it.

A depth-first approach says:
1: Take the first unexplored node attached to your current node, and move to it.
2: Check if it's the end state.
3: If yes: exit with success! Otherwise, goto 1.
4: If at step 1 there are no unexplored nodes attached to your current node: backtrack to the previous node and goto 1.
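
For anyone who'd rather see that as code, here is a minimal Python sketch of the same idea. The case graph, the suspect names and the function name depth_first are all made up for illustration (they are not the diagram from this post), and the backtracking in step 4 is handled by the recursion returning to its caller.

```python
# A hypothetical evidence graph: each node lists the nodes it leads to.
CASE = {
    "murder": ["suspect A", "suspect B", "suspect C"],
    "suspect A": ["A's alibi", "A's car"],
    "A's alibi": ["A's secret"],
    "suspect B": ["B's alibi"],
    "suspect C": ["C's flat"],
}


def depth_first(graph, node, is_goal, visited=None):
    """Visit nodes depth-first; return (visit order, whether the goal was found)."""
    if visited is None:
        visited = []
    visited.append(node)
    if is_goal(node):
        return visited, True
    for neighbour in graph.get(node, []):
        if neighbour not in visited:
            # Drill down into this line of inquiry before trying the next one.
            _, found = depth_first(graph, neighbour, is_goal, visited)
            if found:
                return visited, True
    # Nothing left to explore from here: returning is the "backtrack" step.
    return visited, False


order, found = depth_first(CASE, "murder", lambda n: n == "suspect C")
print(len(order), "nodes checked:", order)
```

In this made-up graph the culprit is the third suspect, so the search exhausts everything hanging off the first two suspects before it gets there.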

In this diagram, the red lines show which nodes are connected and the black arrows show what order the nodes are investigated.


This recursive algorithm produces a burrowing action that drills down as fast as it can to the bottom of the data. It's unlikely to produce a correct answer straight away (except by luck), but produces more interesting television; it's the technique used by Lund and Meyer on Forbrydelsen. They picked a suspect, assumed they did it, harangued them to death, and ultimately found out they were innocent. And OK, I think the detectives generally handled themselves pretty well given the absolute bat-shit-mad number of people who didn't come forward with vital information because they had something really really important to hide. Then again, the convoluted plot somersaults needed to keep the series going for 20 episodes are pretty shameful. For just one example - why did the initial driver of the car not know that it had been picked up when he went home ill? Why did the security guard not come forward with blatantly vital information, which would have cut out three suspects (by my count) and about 8 days of police work?

And worse: why did Meyer cryptically say 'Sara 84' on his deathbed, rather than 'Vagn shot me'? He knows what Vagn looks like, and the implication that Vagn goes round with his face covered is just too silly to contemplate.

Also Vagn - if you're going to steal a vitally important photo from a photo album, don't leave the page saying which photo has been taken in the book: Take the whole book. It's quicker and easier and it's what you would have done without even thinking about it.

Ok, so that was more than one example but so much of the plot was too silly.

So I'll move on to breadth-first search, and in doing so, talk about why I think it's relevant. Breadth-first involves checking nodes for success depending on how close they are to your start node, as illustrated here:



So notice that in this case, the search proceeds layer-by-layer.
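
A rough Python sketch of the layer-by-layer version, for comparison. As before, the case graph and the function name breadth_first are invented for illustration and are not the diagram from this post; the queue is what makes the search finish one layer before starting the next.

```python
from collections import deque

# A hypothetical evidence graph: each node lists the nodes it leads to.
CASE = {
    "murder": ["suspect A", "suspect B", "suspect C"],
    "suspect A": ["A's alibi", "A's car"],
    "A's alibi": ["A's secret"],
    "suspect B": ["B's alibi"],
    "suspect C": ["C's flat"],
}


def breadth_first(graph, start, is_goal):
    """Visit nodes nearest the start first; return (visit order, goal found)."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()  # oldest entry first, so closer nodes come first
        visited.append(node)
        if is_goal(node):
            return visited, True
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return visited, False


order, found = breadth_first(CASE, "murder", lambda n: n == "suspect C")
print(len(order), "nodes checked:", order)
```

With the culprit sitting one step from the start, this search talks to every suspect once and stops, without chasing anyone's alibi to the bottom first.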

I like to think about these searches as 'lines of inquiry'. From your starting point, you have several suspects, and as you open up these lines of inquiry, you have different branching facts of evidence, motive, alibis and inconsistencies, &c.

Take the classic Poirot scene: a lounge, with all the suspects gathered, a few days after the event, as Poirot reveals the sum of his findings. The detective has spent days combing the area and suspects, accumulating evidence, and let the facts speak for themselves. With a little creative thinking, he is able to read between the facts, and given a room full of people who all have motives and windows of opportunity, is able to pick out the murderer. He is a breadth-first searcher; he does not accuse anyone until all the facts are out.

I read P.D. James's "A Mind to Murder" not long ago, and it's a great example of breadth-first detecting: virtually the first half of the book is police interviews with occupants of the house where the murder took place. That's how police work should be done, but to be fair, I didn't enjoy the book. It was too long before any actual, you know, detecting went on.

This is where the 'hunch' comes in and rescues us from overly-procedural police work. Once the basic facts of the scene are established, the detective goes off on their own, follows their hunch, maybe makes a few illegal short-cuts and compromising mistakes. I didn't get the sense of any hunches in Forbrydelsen; it was all or nothing.

In the diagrams above, if the solution to the murder was at node 10, both methods would be as inefficient as each other. But in Forbrydelsen, it was at node 3: the best friend (in this analogy, 1 is Nana, 2 is Theis). And it took the detectives over two weeks to even question him. In fact, in every case, they only took people in for questioning when they suspected them - and this is why I felt, in 2011, let down by the series (which was, in its defence, made in 2007).

We live in a data-rich age. The Petabyte Age, to quote Wired. They talk about the end of theory: living in a time when you don't need models and theories because we have so much information. Who needs a theory of evolution when we have so much data on so many different species? The argument goes, we don't need a model, we have the real thing. The first task in police work should be to accumulate as much evidence as possible - regardless of meaning. Once you've got everything laid out in front of you, the suspect should simply emerge.

The point is, detectives have two resources: facts and hunches. The two build on each other, with facts leading to hunches and vice versa; you start with facts, you need to end with facts, but along the way you can develop a hunch to get you through the dry times and give you some direction - called a heuristic in the search lexicon. Too much fact and not enough hunch, and you end up in staid 'A Mind to Murder' territory. Too much hunch and not enough fact, and you get Armstrong & Miller's 'Force on the Case', where a raving alcoholic ex-policeman repeatedly accuses his book-shop rival of being a murderer.
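
One way to picture a hunch as a heuristic is a greedy best-first search, where the next node to check is whichever one the detective currently finds most suspicious. This is only a sketch under assumptions: the case graph, the SUSPICION scores and the names best_first and hunch are all invented for illustration, and lower scores mean "more suspicious".

```python
import heapq

# A hypothetical evidence graph: each node lists the nodes it leads to.
CASE = {
    "murder": ["suspect A", "suspect B", "suspect C"],
    "suspect A": ["A's alibi", "A's car"],
    "A's alibi": ["A's secret"],
    "suspect B": ["B's alibi"],
    "suspect C": ["C's flat"],
}

# The detective's hunch (made-up numbers): lower means more suspicious.
SUSPICION = {"suspect C": 0, "suspect B": 1, "suspect A": 2}


def hunch(node):
    """Score a node; anything the detective has no feeling about gets 5."""
    return SUSPICION.get(node, 5)


def best_first(graph, start, is_goal, score):
    """Always check the most suspicious known node next."""
    visited = []
    seen = {start}
    counter = 0  # tie-breaker so equal scores are checked in discovery order
    frontier = [(score(start), counter, start)]
    while frontier:
        _, _, node = heapq.heappop(frontier)
        visited.append(node)
        if is_goal(node):
            return visited, True
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                counter += 1
                heapq.heappush(frontier, (score(neighbour), counter, neighbour))
    return visited, False


order, found = best_first(CASE, "murder", lambda n: n == "suspect C", hunch)
print(len(order), "nodes checked:", order)
```

A good hunch gets straight to the answer; a bad one (say, scoring suspect A as most suspicious) just reorders the legwork, and the search still ends on facts because it only stops when the goal check passes.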


But in the Petabyte Age, we don't need hunches or theories. We just need to accumulate data. That's what I think is sad about Forbrydelsen: they don't use facts or hunches. They just blindly follow and exhaust linear enquiries, without considering the whole picture. It's one thing to be overly-procedural, and it's one thing to be a loose cannon; but the detectives in Forbrydelsen are neither, with a paucity of data and no hunch to give them direction. Their only strategy is pure depth-first search.

So the problems with Forbrydelsen are threefold:

1: the answer felt cheap (like they clumsily lumped together a crime of passion carried out by a serial killer),
2: there was the fact that any one of about 10 people should have come forward with information, but had something else to hide (which is pushing it, and just for the sake of throwing in red herrings and prolonging the series), and
3: the detectives just didn't seem very good at their job - relying on too little data and accusing people too readily. But then, I suppose this follows from point two; as I said earlier, what could they do, as rational actors with a plot this convoluted?

So I'd like to see a 21st century, web 2.0 detective, totally data-led in his methodology. In short: Ben Goldacre PI.


*Yes, he had motives to stop Nana leaving the country, but not a motive to ritually rape and murder her. The killer was described as methodical and not impulsive, while Vagn's actions are totally off the cuff and unplanned.

**The only other person it could have been was Morten, for the same reason of being a 'big bad friend' of another protagonist. It turned out he had his own twist.