Wednesday, 1 February 2012

Extract Data from your LinkedIn Network

Some data extraction experiments with LinkedIn reveal the vast majority of my network is not very active.

As part of this investigation, I wrote some code to retrieve and process network updates from LinkedIn. Updates can be retrieved in their entirety or by type. I concentrated on "shares", as these are a good indicator of someone's activity; I'll investigate the other types later.

I chose to write the results to CSV files (an obvious improvement would be to write them to a database) and then imported them into OpenOffice.org to produce the following summary graphic:

Only 9% of my network shared an update in January 2012, which we might regard as "active" behaviour, as opposed to adding new connections, which we might regard as "passive". A breakdown of that 9% shows that, as ever, a small number of "leaders" are regularly active, while quite a large "tail" updates very infrequently.

Which of these approaches stands more chance of reaching out to your existing and potential customers within your network?
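
The 9% figure and the leader/tail breakdown both come down to counting shares per person. The following is a minimal sketch of that calculation, not the code used for this post. It assumes the shares have already been reduced to a list of names, one entry per share; the class name, method names and sample names are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ActivitySummary {

    // Count shares per person. Each list entry is the name of whoever
    // posted one share, so a name appears once per share.
    static Map<String, Integer> sharesPerPerson(List<String> sharers) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : sharers) {
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }

    // Percentage of the whole network that shared at least once.
    static double activePercent(Map<String, Integer> counts, int networkSize) {
        if (networkSize <= 0) {
            throw new IllegalArgumentException("networkSize must be positive");
        }
        return 100.0 * counts.size() / networkSize;
    }

    public static void main(String[] args) {
        // Placeholder names, not real data from my network.
        List<String> sharers = Arrays.asList("Person A", "Person A", "Person B", "Person A");
        Map<String, Integer> counts = sharesPerPerson(sharers);
        System.out.println(counts);
        System.out.println(activePercent(counts, 20) + "% active");
    }
}
```

Sorting the resulting counts in descending order separates the few "leaders" at the top from the long "tail" of people with one or two shares.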

Technical Notes

For performance reasons, the LinkedIn API restricts both the volume of information returned and the number of calls that can be made. The next step is to gather more data by reading it more frequently and accumulating it for further analysis.
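
Reading more frequently means successive calls will often return overlapping updates, so the accumulated data needs de-duplicating. Below is a rough sketch of one way to do that. It assumes each update carries some stable identifier that can be used as a map key; the `UpdateAccumulator` class and the sample keys are hypothetical and not part of LinkedIn-J.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class UpdateAccumulator {

    // Updates seen so far, keyed by their (assumed) stable identifier,
    // kept in the order they were first seen.
    private final Map<String, String> updatesByKey = new LinkedHashMap<>();

    // Merge one batch of updates and return how many were new.
    // Updates already seen in an earlier batch are left untouched.
    int merge(Map<String, String> batch) {
        int added = 0;
        for (Map.Entry<String, String> entry : batch.entrySet()) {
            if (updatesByKey.putIfAbsent(entry.getKey(), entry.getValue()) == null) {
                added++;
            }
        }
        return added;
    }

    int size() {
        return updatesByKey.size();
    }

    public static void main(String[] args) {
        UpdateAccumulator store = new UpdateAccumulator();

        // Two overlapping batches with placeholder keys and text.
        Map<String, String> first = new LinkedHashMap<>();
        first.put("update-1", "Person A shared a link");
        first.put("update-2", "Person B shared a link");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("update-2", "Person B shared a link");
        second.put("update-3", "Person A shared a link");

        System.out.println("new in first batch: " + store.merge(first));
        System.out.println("new in second batch: " + store.merge(second));
        System.out.println("total stored: " + store.size());
    }
}
```

In practice the in-memory map would be replaced by a file or database so that the data survives between runs.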

I used LinkedIn-J, a Java library for accessing the LinkedIn API, to extract data from my LinkedIn network. I won't cover the specifics of using the API, as these are covered much better elsewhere, but the basic steps are as follows:

  • Register your application with LinkedIn, and get your API keys
  • Log in to LinkedIn and obtain authentication tokens for your application to allow it to access your data
  • Write some Java code to call the API, read and process the results
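
The "process the results" part of the last step can be sketched without touching the API at all. The example below writes share records to CSV using only the standard library. It is an illustration, not the code behind this post: the `ShareCsv` class, the column names and the sample rows are all assumptions.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

final class ShareCsv {

    // Quote a field only when it contains a comma, a double quote or a
    // line break; embedded double quotes are doubled.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Write a header row followed by one line per share.
    // The column names are hypothetical.
    static void write(Writer out, List<String[]> rows) throws IOException {
        out.write("person,date,comment\n");
        for (String[] row : rows) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < row.length; i++) {
                if (i > 0) {
                    line.append(',');
                }
                line.append(escape(row[i]));
            }
            out.write(line.append('\n').toString());
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder rows, not real data from my network.
        List<String[]> rows = Arrays.asList(
                new String[] {"Person A", "2012-01-15", "Reading \"Big Data\", part 2"},
                new String[] {"Person B", "2012-01-20", "New post"});
        StringWriter out = new StringWriter();
        write(out, rows);
        System.out.print(out);
    }
}
```

Passing a `FileWriter` instead of the `StringWriter` produces a file that OpenOffice.org can import directly.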

Privacy Options

Note that you will be unable to retrieve full details for any of your connections who have elected not to share their data with third-party applications.
