Object models and data sets for a social network: crawlers vs. simulation

This relates to my PhD. I wonder how other people do this (or whether you have run into similar problems).

To simulate a social network, we need datasets, and these are not easy to get.

There are some crawlers, but they have limitations.

One option I am considering is to generate data for a social network from a set of parameters.

For this, we need a data model (object model) of a social network.

For example, Facebook has profiles, relations (friends), posts (wall), groups, applications and so on.
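
As a rough sketch of what such an object model could look like, the entities listed above can be written as a few plain Python classes. The class names, field names and helper methods here are placeholders chosen for illustration, not Facebook's actual schema, and the choice to store friendships as symmetric sets of user ids is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Profile:
    user_id: int
    name: str
    friends: Set[int] = field(default_factory=set)       # relations (friends)
    groups: Set[int] = field(default_factory=set)        # ids of joined groups
    applications: Set[str] = field(default_factory=set)  # installed applications


@dataclass
class Post:
    post_id: int
    author_id: int
    wall_owner_id: int  # whose wall the post appears on
    text: str


@dataclass
class Group:
    group_id: int
    name: str
    members: Set[int] = field(default_factory=set)


@dataclass
class SocialNetwork:
    profiles: Dict[int, Profile] = field(default_factory=dict)
    groups: Dict[int, Group] = field(default_factory=dict)
    posts: List[Post] = field(default_factory=list)

    def add_profile(self, user_id: int, name: str) -> Profile:
        profile = Profile(user_id, name)
        self.profiles[user_id] = profile
        return profile

    def add_friendship(self, a: int, b: int) -> None:
        # Assumption: friendship is symmetric, so both sides are updated.
        if a == b:
            raise ValueError("a user cannot befriend themselves")
        self.profiles[a].friends.add(b)
        self.profiles[b].friends.add(a)

    def add_group(self, group_id: int, name: str) -> Group:
        group = Group(group_id, name)
        self.groups[group_id] = group
        return group

    def join_group(self, user_id: int, group_id: int) -> None:
        self.groups[group_id].members.add(user_id)
        self.profiles[user_id].groups.add(group_id)

    def post_on_wall(self, author_id: int, wall_owner_id: int, text: str) -> Post:
        # Post ids are simply the position in the global post list.
        post = Post(len(self.posts), author_id, wall_owner_id, text)
        self.posts.append(post)
        return post


# Small usage example with made-up users.
net = SocialNetwork()
net.add_profile(1, "alice")
net.add_profile(2, "bob")
net.add_friendship(1, 2)
net.add_group(10, "phd-students")
net.join_group(1, 10)
net.post_on_wall(author_id=1, wall_owner_id=2, text="hello")
```

A generator or a crawler could both fill the same structure, which would make it easier to compare generated data against crawled data.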

Ultimately, I am thinking about the characteristics of social data, and then about how closely generated data can mirror real life.

For this, we need to know the distributions, i.e. the typical number of friends, blog posts, etc. for social networks. This link gives some info, but not a lot.
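
To make the role of the distribution concrete, here is a minimal sketch of a parameter-driven generator for the friendship graph. It assumes, purely for illustration, that friend counts are heavy-tailed and grows the graph by preferential attachment (new users tend to befriend users who already have many friends). Whether that matches any real network is exactly the open question; the function and parameter names are hypothetical.

```python
import random
from collections import Counter


def generate_friend_graph(n_users, links_per_new_user, seed=0):
    """Grow a friendship graph by preferential attachment (an assumed model)."""
    m = links_per_new_user
    if not 1 <= m < n_users:
        raise ValueError("need 1 <= links_per_new_user < n_users")
    rng = random.Random(seed)
    friends = {u: set() for u in range(n_users)}

    # 'endpoints' lists each user once per friendship they have, so a uniform
    # pick from it selects a user with probability proportional to degree.
    endpoints = []

    # Start from a small, fully connected core of m + 1 users.
    for a in range(m + 1):
        for b in range(a + 1, m + 1):
            friends[a].add(b)
            friends[b].add(a)
            endpoints += [a, b]

    # Every later user befriends m distinct existing users.
    for new in range(m + 1, n_users):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        for old in chosen:
            friends[new].add(old)
            friends[old].add(new)
            endpoints += [new, old]
    return friends


def degree_histogram(friends):
    """Map 'number of friends' to 'number of users with that many friends'."""
    return Counter(len(f) for f in friends.values())


graph = generate_friend_graph(n_users=200, links_per_new_user=3, seed=42)
hist = degree_histogram(graph)
for degree in sorted(hist):
    print(degree, hist[degree])
```

The histogram is the thing to compare against whatever figures are available for a real network; if the real distribution looks different, the attachment rule (or the whole model) would have to change.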

Any thoughts on:

a) Existence of a data model

b) Your approach to simulation / data sets

c) Crawlers vs. data generation

d) Data distribution

Apparently, this is also a big topic at SIGCOMM this year!

Kind regards