I recently botched a job interview by poorly answering a straightforward question: how do sites like LinkedIn efficiently show the relationship distance (1st/2nd/3rd) from you to every person displayed on a page (e.g. in people search results, list of people working in a company, etc.)?

I got the essential "trick" of the solution: finding "distance from me" is a common operation (e.g. 20x+ on a single page, 100s per login session), so you can do part of the "distance from me to X" computation, cache it, and then re-use that cached partial result many times to make other operations much cheaper. I also guessed that the partial result was likely to be my second-level connections, because "cache all 3rd-level connections" would be too costly in RAM and CPU.

But when trying to convert this insight into a solution, I came up with a bumbling answer involving creating persistent caches of 2nd-level connections of everyone on the site (which would have been hugely expensive in perf and complex to maintain), and I took an inexplicable detour into using Bloom Filters in a way that made little technical sense. I wouldn't have hired myself after an answer like that!

Later, as I thought about the problem without the pressure of an interview hanging over my head, I came up with a more reasonable answer.

Build a very fast way to get the first-level connections for each of a batch of user IDs (batch size up to ~1000?). This probably means a dedicated cluster of lots-of-RAM servers which can cache the entire network's 1st-level connections in memory. At roughly 100 connections per member x 4 bytes per member ID, that is about 400 bytes per member.

Refactor into a batch implementation ("look up distance from me to N different users") so you can get all the remote results from step #3 without having to make up to N remote calls.

But I'm sure there are better answers to this. What's yours? If you want an extra challenge, try simulating an interview situation (can't look up solutions on the Web).

Note that the question was about an optimal solution, regardless of how LinkedIn actually does it today, which I looked up after I wrote my own answer above.

If you think about it, doing this in SQL could be very processor intensive. Given that, the fact that it will ultimately be used all over the place, and that space is relatively cheap, I would suggest creating an index using Lucene (or Lucene.NET), depending on your language preference.
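To make the "cache my second-level connections, then batch the rest" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how LinkedIn does it: the graph is a plain in-memory dict with symmetric (undirected) connections, and `first_level_batch`, `second_level` and `distances_from_me` are hypothetical names invented for this example. In a real system `first_level_batch` would be the single batched remote call to the lots-of-RAM connection cache.

```python
from typing import Dict, Iterable, Optional, Set

# Assumption: member ID -> set of 1st-level connection IDs, symmetric.
Graph = Dict[int, Set[int]]


def first_level_batch(graph: Graph, member_ids: Iterable[int]) -> Dict[int, Set[int]]:
    """Stand-in for one batched lookup of 1st-level connections."""
    return {m: graph.get(m, set()) for m in member_ids}


def second_level(graph: Graph, me: int) -> Set[int]:
    """Members exactly two hops from `me`; this is the partial result worth caching."""
    firsts = graph.get(me, set())
    result: Set[int] = set()
    for conns in first_level_batch(graph, firsts).values():
        result |= conns
    result -= firsts      # drop people already at distance 1
    result.discard(me)    # drop myself
    return result


def distances_from_me(
    graph: Graph,
    me: int,
    targets: Iterable[int],
    my_second: Optional[Set[int]] = None,
) -> Dict[int, Optional[int]]:
    """Return 0/1/2/3 for each target, or None if farther than 3 hops."""
    targets = list(targets)
    firsts = graph.get(me, set())
    if my_second is None:
        my_second = second_level(graph, me)  # would normally come from a cache

    # Only targets not settled by the cached sets need the batched lookup.
    unresolved = [
        t for t in targets
        if t != me and t not in firsts and t not in my_second
    ]
    target_firsts = first_level_batch(graph, unresolved)

    out: Dict[int, Optional[int]] = {}
    for t in targets:
        if t == me:
            out[t] = 0
        elif t in firsts:
            out[t] = 1
        elif t in my_second:
            out[t] = 2
        elif target_firsts[t] & my_second:
            # A connection of the target is two hops from me, so the target is three.
            out[t] = 3
        else:
            out[t] = None
    return out


# Tiny chain 1-2-3-4-5 plus an isolated member 6.
example = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}, 6: set()}
print(distances_from_me(example, 1, [1, 2, 3, 4, 5, 6]))
# {1: 0, 2: 1, 3: 2, 4: 3, 5: None, 6: None}
```

The second-level set is computed once and reused for every person on the page, so each page view costs at most one batched lookup plus a set intersection per unresolved target.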