Web science: what and how?

From the article “Web Science: An Interdisciplinary Approach to Understanding the Web” in the July issue of CACM (which, by the way, looks quite impressive after the editorial overhaul!):

At the micro scale, the Web is an infrastructure of artificial languages and protocols; it is a piece of engineering. […] The macro system, that is, the use of the micro system by many users interacting with one another in often-unpredicted ways, is far more interesting in and of itself and generally must be analyzed in ways that are different from the micro system. […] The essence of our understanding of what succeeds on the Web and how to develop better Web applications is that we must create new ways to understand how to design systems to produce the effect we want.  The best we can do today is design and build in the micro, hoping for the best, but how do we know if we’ve built in the right functionality to ensure the desired macroscale effects? How do we predict other side effects and the emergent properties of the macro? […] Given the breadth of the Web and its inherently multi-user (social) nature, its science is necessarily interdisciplinary, involving at least mathematics, CS, artificial intelligence, sociology, psychology, biology and economics.

This is a noble goal indeed. The Wikipedia article on sociology sounds quite similar in many respects:

Sociologists research macro-structures and processes that organize or affect society […] And, they research micro-processes […] Sociologists often use  quantitative methods—such as social statistics or network analysis—to investigate the structure of a social process or describe patterns in social relationships. Sociologists also often use qualitative methods—such as focused interviews, group discussions and ethnographic methods—to investigate social processes.

First, we have to keep in mind that the current Western notion of “science” is fairly recent. Furthermore, it has not always been the case that technology follows science. As an example, in the book “A People’s History of Science” by Clifford Conner, one can find the following quotation from Galileo’s Two New Sciences, about Venice’s weapons factory (the Arsenal):

Indeed, I myself, being curious by nature, frequently visit this place for the mere pleasure of observing the work of those who, on account of their superiority over other artisans, we call “first rank men.” Conference with them has often helped me in the investigation of certain effects, including not only those which are striking, but also those which are recondite and almost incredible.

Later on, Conner says (p.284), again quoting Galileo himself from the same source:

[Galileo] demonstrated mathematically that “if projectiles are fired … all having the same speed, but each having a different elevation, the maximum range … will be obtained when the elevation is 45°: the other shots, fired at angles greater or less will have a shorter range.” But in recounting how he arrived at that conclusion, he revealed that his initial inspiration came from discussions at the Arsenal: “From accounts given by gunners, I was already aware of the fact that in the use of cannons and mortars, the maximum range, that is the one in which the shot goes the farthest, is obtained when the elevation is 45°.” Although Galileo’s mathematical analysis of the problem was a valuable original contribution, it did not tell workers at the Arsenal anything they had not previously learned by empirical tests, and had little effect on the practical art of gunnery.
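
For reference, the 45° result in the quotation can be recovered in a few lines from the modern idealized model. This is only a sketch under stated assumptions (no air resistance, launch and landing at the same height), not Galileo’s own geometric argument; the symbols v (launch speed), θ (elevation), g (gravitational acceleration), and R (range) are introduced here for illustration.

```latex
% Assumed idealized model: constant g, no drag, level ground.
\begin{align*}
  x(t) &= v\cos\theta \, t, &
  y(t) &= v\sin\theta \, t - \tfrac{1}{2} g t^{2} \\
  y(t_f) = 0 \;&\Rightarrow\; t_f = \frac{2 v \sin\theta}{g} \\
  R(\theta) &= x(t_f) = \frac{2 v^{2} \sin\theta\cos\theta}{g}
             = \frac{v^{2}\sin 2\theta}{g} \\
  \sin 2\theta \le 1 \;&\Rightarrow\; R_{\max} = \frac{v^{2}}{g}
             \quad\text{at } \theta = 45^{\circ} \\
  R(45^{\circ} \pm \delta) &= \frac{v^{2}\cos 2\delta}{g} < R_{\max}
             \quad\text{for } 0 < \delta \le 45^{\circ}
\end{align*}
```

The last line matches the second half of the quoted claim: shots fired equally far above or below 45° land at the same, shorter distance. With real cannons, air resistance would shift the optimum, which is one more reason the gunners’ empirical knowledge mattered.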

In any case, facilitating “technology” or “engineering” is certainly not the only good reason to pursue scientific knowledge. Conversely, although “pure science” certainly has an important role, it is not the only ingredient of technological progress (something I’ve alluded to in a previous post about, essentially, the venture capital approach to research).  Furthermore, some partly misguided opinions about the future of science have brightly shot through the journalistic sphere.

However, if, for whatever reason, we decide to go the way of science (a worthy pursuit), then I am reminded of the following interview of Richard Feynman by the BBC in 1981 (full programme):

Privacy concerns notwithstanding, the web gives us unprecedented opportunities to collect measurements in quantities and levels of detail that simply were not possible when the venerable state-of-the-art involved, e.g., passing around written notes among a few people. So, perhaps we can now check hypotheses more vigorously and eventually formulate universal laws (in the sense of physics).  Perhaps the web will allow us to prove Feynman wrong.

I’m not entirely convinced that it is possible to get quantitative causal models (a.k.a. laws) of this sort. But if it is, then we need an appropriate experimental apparatus for large-scale data analysis to test hypotheses. What would be, say, the LHC-equivalent for web science? (Pure science, after all, seems to have an increasing need for powerful apparatus.) I’ll write some initial thoughts and early observations on this in another post.

I’m pretty sure that my recent posts have been circling around something, but I’m not quite sure yet what that is. In any case, all this seems an interesting direction worth pursuing. Even though Feynman was sometimes a very harsh critic, we should perhaps remember his words along the way.