Applying Foucault’s Archaeology of Knowledge to Google Analytics

street sign photo: rue foucault

Introduction: A Brief Overview of Google Analytics

Google Analytics consists of two main components: Google-programmed JavaScript code embedded on each page within a website, “which collects and sends visitor activity to your Google Analytics account” (“How Analytics Impacts,” 2014), and the reporting mechanism connected to that code, available at www.google.com/analytics, where visitor activity is collected and displayed. The data are sent to Google’s servers for storage via the Internet, mediated by the networked hardware elements (switches, routers, fiber, etc.) of the Web.

A visit to a web page in which Google Analytics code is embedded activates the embedded snippet, generates data, and sends those data points to Analytics.

Code snippet sample (from spcs.richmond.edu)

<script type="text/javascript">
 var _gaq = _gaq || [];
 // Main Site Account
 _gaq.push(['_setAccount', 'UA-xxxxxxx-1']);
 _gaq.push(['_trackPageview']);
 // Legacy Account
 _gaq.push(['l1._setAccount', 'UA-xxxxxxx-2']);
 _gaq.push(['l1._trackPageview']);
 //rollup account 
 _gaq.push(['rup._setAccount', 'UA-xxxxxxx-1']);
 _gaq.push(['rup._setDomainName', 'richmond.edu']);
 _gaq.push(['rup._trackPageview']);

 (function() {
 var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
 ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
 var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
 })();
 </script>

These data points include hundreds of characteristics of the visit, including page visited, time on site/time on page, referral sources, links selected to exit the site, and more. A visual representation of these data points is available below in Figure 1. All data points are recorded in Analytics at the instant of the visit (delayed or rerouted as needed by network hardware). The action of the visit generates these data points; once the visit has been recorded, there is no writing of data in Analytics until another set of data points is recorded via user interaction with content on the page.
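The point that nothing further is written until another interaction occurs can be sketched with the same _gaq command queue used in the sample snippet above. This is a minimal illustration, not the code running on spcs.richmond.edu: the helper recordInteraction, the element id intro-video, and the category, action, and label values are all hypothetical. Only the '_trackEvent' command name comes from the ga.js library itself.

```javascript
// Same command queue the sample snippet uses; ga.js drains it once loaded.
var _gaq = _gaq || [];

// Hypothetical helper (not part of ga.js): queue one event hit.
// '_trackEvent' is the ga.js event command; the argument values are illustrative.
function recordInteraction(category, action, label) {
  _gaq.push(['_trackEvent', category, action, label]);
  return _gaq.length;
}

// In a browser this would be wired to a real interaction, for example:
// document.getElementById('intro-video').addEventListener('play', function () {
//   recordInteraction('Video', 'Play', 'Program overview');
// });

// Simulate a single interaction so the sketch runs outside a browser.
recordInteraction('Video', 'Play', 'Program overview');
```

Until a handler like this fires, the queue stays unchanged, which mirrors the claim that data points are inscribed only at the moment of interaction.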

Diagram

Figure 1: Visualizing a sample Google Analytics data set—Popplet 

An active user interaction with the web page itself is required for Google Analytics to register data points. An important distinction is that web crawlers, like Google crawlers, are not recorded as visits; only visitor interactions trigger a response from the embedded code snippet. There is a direct relationship between the data being collected in Analytics and a user’s interaction. However, the user does not enter these data in a conscious or meaningful way; they are simply collected and inscribed to Analytics in a transparent fashion.

Data collected in Google Analytics are aggregated and, for the end user, not individualized. Analytics’ data privacy and security documentation notes that individual user data may be collected but are not provided to the end user (the Analytics account owner, manager, or specialist): “Google Analytics customers are prohibited from sending personally identifiable information to Google, but this principle might not apply in some instances in which Google Analytics is used to analyze how Google products and services are used by signed in account holders” (“We Use Our Own Products,” 2014). The end user is unable to reconstitute personally identifying information about visitors from the data provided.

However, the aggregated data are able to describe a nuanced portrait of our website visitors, to the point that multiple aggregate profiles are created. The data can help us answer questions about our website, like how many users access the site using a mobile device or tablet or how many pages in the site an average user visits. The answers to these questions, in turn, generate action items to customize web page content to user technologies and patterns of behavior.
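The two sample questions above (how many users arrive on a mobile device or tablet, and how many pages an average user visits) can be sketched as simple aggregations. The records below are invented for illustration, and the field names deviceCategory and pageviews are my own assumptions rather than an actual Analytics export format.

```javascript
// Hypothetical session records; values and field names are illustrative only.
const sessions = [
  { deviceCategory: 'desktop', pageviews: 4 },
  { deviceCategory: 'mobile', pageviews: 2 },
  { deviceCategory: 'tablet', pageviews: 3 },
  { deviceCategory: 'desktop', pageviews: 7 },
];

// Fraction of sessions that came from a mobile device or tablet.
function shareOnMobileOrTablet(records) {
  if (records.length === 0) return 0;
  const count = records.filter(
    (r) => r.deviceCategory === 'mobile' || r.deviceCategory === 'tablet'
  ).length;
  return count / records.length;
}

// Mean number of pages viewed per session.
function averagePagesPerSession(records) {
  if (records.length === 0) return 0;
  const total = records.reduce((sum, r) => sum + r.pageviews, 0);
  return total / records.length;
}

console.log(shareOnMobileOrTablet(sessions));  // 0.5
console.log(averagePagesPerSession(sessions)); // 4
```

Neither result identifies an individual visitor; each describes the aggregate only.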

Relationship to Foucault

Foucault sought to avoid transcribing discourse within traditional unities like genre or oeuvre; instead, he sought dynamic dispersions to describe the sum of component parts brought together for a specific exigence. “The rules of formation are conditions of existence (but also of coexistence, maintenance, modification, and disappearance) in a given discursive division” (p. 38). In Google Analytics, aggregated data contribute to a discursive monument (p. 139) that describes visits to the website, ostensibly for a specific exigence (e.g. to learn more about the School of Professional and Continuing Studies degree programs: see Figure 2).

Web page screen capture

Figure 2: Sample University of Richmond School of Professional and Continuing Studies degree programs web page

These data, when examined by the end user, help determine whether content or information architecture of the website needs to be revised (e.g. visitors are spending less time on one program page than on another – does this suggest content is more or less compelling on one page than another? See Figure 3).

google analytics screen capture

Figure 3: Exigencies that arise from reviewing Analytics data: Should we revise content in the /hr-management/ folder because average time on page was so much less than /education/ during last month?

This process describes what I might consider a double exigence. On one hand, a visitor’s exigency inscribes visit data in Google Analytics; on the other hand, the end user reviews aggregated visit data to answer questions about the content and/or structure of the website.

Discursive Formation 

There is a single moment, one that is likely measured in milliseconds, even nanoseconds, in which the result of a user’s concrete interaction on a specific web page is the inscription of data on an encrypted Google server containing our Google Analytics account. This moment describes the discursive formation of a statement. Within Analytics, there is no way to have predicted that irruptive moment would occur, as the moment involved a single independent individual having a single, concrete, specific interaction with a specific web page. There is also no way to repeat that exact irruptive moment or that exact discursive enunciation. Even if the individual were to visit that page again within 30 days, the statement would be described in terms of a repeat rather than new visit, likely resulting from selecting a local browser bookmark or conducting a different search. Its existence as an Analytics artifact would therefore differ from previous recorded visits. The statement is not a structural unity; rather, it’s a function of the user’s instantiated interaction with a web page. “This is because it [a statement] is not itself a unit, but a function that cuts across a domain of structures and possible unities, and which reveals them, with concrete contents, in time and space” (Foucault 2010/1972, p. 87). In fact, I can see the monument of that moment in time and space in Analytics (see Figure 3, above, for a visualization of aggregated moments over a month).

Nodes in the Network

Google Analytics defines nodes in its network in terms of metrics and dimensions. Metrics are “quantitative measurements of users, sessions and actions” and dimensions are “characteristics of users, their sessions and actions” (Google Analytics Academy, 2013). In the Popplet Figure 1 (above), “Referring Source” describes metrics and “Visitor Info” describes dimensions. To generate any relationship among metrics and dimensions, a visitor actively engages with a web page that contains embedded code. The visitor to the page, in this case, would be Foucault’s subject. Foucault describes discourse as being formed in the differential relationship among speaker, site, and subject’s relationship to the object (p. 55). Within Analytics, we can see these elements working together to generate a statement. The creator(s) of the web page, both its content and its embedded Analytics code, and the host of the web page, in physical and virtual space, act together as speaker. The speaker presents the page in question (the object) to the subject. The site is described in several different ways as the visitor interacts with the page: site is captured in dimensions that define user characteristics like amount of time spent on a page, browser type, platform, time of day, IP address of the visiting computer’s physical network, approximate geographic location of the visitor’s browser, and more. The subject’s relationship to the page (object) is captured by metrics that measure activity, including referring source (the link clicked or URL entered to arrive at the website in question). Metrics and dimensions work together as discursive formation that is collected in Analytics. Without a differentiated relationship (in which the subject is entering URLs, selecting links, or some other positivistic action that generates browser activity), no discursive content is collected.
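One way to picture metrics and dimensions working together is to total a metric for each value of a dimension. The sketch below uses browser type as the dimension and pageviews as the metric. The rows, field names, and the function sumMetricByDimension are hypothetical and stand in for whatever Analytics stores internally.

```javascript
// Hypothetical rows: each pairs a dimension value (browser) with a metric (pageviews).
const rows = [
  { browser: 'Chrome', pageviews: 3 },
  { browser: 'Firefox', pageviews: 2 },
  { browser: 'Chrome', pageviews: 5 },
  { browser: 'Safari', pageviews: 1 },
];

// Sum a numeric metric for every distinct value of a dimension.
function sumMetricByDimension(data, dimension, metric) {
  const totals = {};
  for (const row of data) {
    const key = row[dimension];
    totals[key] = (totals[key] || 0) + row[metric];
  }
  return totals;
}

const pageviewsByBrowser = sumMetricByDimension(rows, 'browser', 'pageviews');
console.log(pageviewsByBrowser); // { Chrome: 8, Firefox: 2, Safari: 1 }
```

With no rows (no visitor activity), the function returns an empty object, which parallels the observation that no discursive content is collected without an action by the subject.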

Definition

Google Analytics is a Foucauldian archive of networked discourse. The discursive formation occurs the moment a subject follows or enters a web link. The active interaction of subject, object, and speaker/author/creator generates a discursive statement. That statement’s networked archive is inscribed as an assemblage of data points. A summary of those data points—in relationship to one another as metrics and dimensions and in relationship to subject, object, speaker, goals, and events—appears below in Figure 4.

Popplet screen capture

Figure 4: Google Analytics as networked archive of a discursive statement—Popplet

Agency and Flow

Google Analytics nodes are metrics and dimensions. These nodes have no agency of themselves. They are created and inscribed in the moment of visiting a web page.

However, Analytics requires agency at higher levels of the network hierarchy, in the differentiated relationship among speaker (page author, coder, and host), site (metrics and dimensions), and subject (visitor) relationship to the object (web page). Among these nodes (which are tangentially part of the Analytics network because the object contains the embedded Analytics code snippet), the subject is the agent that creates and sustains the network. As the result of a concrete action on a tracked web page, visit data are generated by the embedded code snippet and transmitted, via network hardware, to Google servers. At the same moment, a small piece of data (a cookie) is written to (or updated in) the subject’s browser; it assists the tracking snippet in determining whether the visitor is new or returning to the page. User agency can erase the cookie, which may alter the dimension of new or returning visitor, and the user can determine whether to follow links, stay on the page, or follow an embedded event (like watching a video or reviewing a news feed). Agency and flow are largely “single bus” activities: they travel from the visitor to the Google server, but not directly back to the visitor. Some indirect agency can be found in the speaker (author and coder) in that results of metrics and dimensions analyses may include changes to web pages that become new again to the subject (visitor).
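The cookie's role in the new-versus-returning distinction, and the way user agency can reset it, can be sketched as follows. This is a simplified model under stated assumptions: a plain Map stands in for the browser's cookie store, and the cookie name visitor_id and the function classifyVisit are hypothetical. The actual ga.js cookies are named and structured differently.

```javascript
// Classify a visit as new or returning based on whether an identifying
// cookie is already present. cookieStore is a Map standing in for the
// browser's cookie storage (an assumption made so the sketch runs anywhere).
function classifyVisit(cookieStore, now) {
  const COOKIE_NAME = 'visitor_id'; // hypothetical cookie name
  const existing = cookieStore.get(COOKIE_NAME);
  if (existing) {
    existing.lastSeen = now; // update the cookie on a repeat visit
    return { visitorType: 'returning', id: existing.id };
  }
  // No cookie found: write one so later visits can be recognized.
  const id = 'v-' + now + '-' + Math.floor(Math.random() * 1e9);
  cookieStore.set(COOKIE_NAME, { id: id, lastSeen: now });
  return { visitorType: 'new', id: id };
}

const browserCookies = new Map();
const firstVisit = classifyVisit(browserCookies, 1000);  // cookie written
const secondVisit = classifyVisit(browserCookies, 2000); // cookie found
browserCookies.clear();                                  // the user erases cookies
const thirdVisit = classifyVisit(browserCookies, 3000);  // counted as new again
```

The third visit comes from the same person, yet it is recorded as new: the subject's agency over the cookie changes what the archive says about the subject.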

The Archive and the Archaeologist

As the person who has been granted administrative authority by our central website authority (Director of Web Services) to interact with data in Google Analytics, I have access to a vast (albeit potentially incomplete, given Google’s ownership of the archive itself) portion of the archive of discourse. Foucault describes an archive as the collection of discursive formations, a finite collection that does not point to some transcendent future or some ideal meaning. “The never completed, never wholly achieved uncovering of the archive forms the general horizon to which the description of discursive formations, the analysis of positivities, the mapping of the enunciative field belong” (p. 131). Analytics does not ascribe meaning to the discursive moment itself. Rather, it records the irruptive actions of the discursive formation as a collection of statements in an archive. As an administrative user, I can access that archive and recreate a visualization of discursive moments that occurred. They are inscribed in the metrics and dimensions recorded at the irruptive moment. At best, I can “dig into” the archived results to determine patterns of activity (metrics) and characteristics (dimensions). I and other users with access to some or all aspects of the Analytics account are archaeologists plumbing the depths of the archive.

Google Analytics visualizes flow by archiving the actions that generated flow, but Analytics data themselves are not in flow. They’re an archive of data generated via discourse. For lack of a better analogy, GA is a chapter book I can read that contains archived evidence of discourse. Those traces represent, but are not themselves, the discursive formations of statements.

Conclusions

Google Analytics is a networked archive and an archived network.

Networked archive: The archive is networked in that it collects interrelated data points and demonstrates the relationship among those data points using visualizations and aggregated data. Those relationships can be explored by someone with user access to the Google Analytics account. In this networked instant, my role as archive archaeologist activates the network, which otherwise represents little more than a collection of data points that, at the moment of web browsing, represented active discourse.

Archived network: The network is archived in that Google Analytics collects the network activity of subjects, objects, and creators/speakers—their discourses. The subject’s interaction with a web page results in discursive formation of statements; a sample statement is visualized in Figure 4 (above). A collection of such statements from a single subject is aggregated as a user session, which I would consider Foucault’s concept of a monument. A collection of those user sessions (monuments) in aggregate is the archive, and that’s what Analytics gives access to.
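The layering described above (statements aggregated into sessions, sessions aggregated into the archive) can be sketched by grouping recorded hits per visitor and splitting them wherever a long gap in activity occurs. The 30-minute inactivity threshold, the hit fields visitorId and timestamp, and the function groupIntoSessions are assumptions made for illustration. Analytics applies its own session rules, which are configurable and more involved than this.

```javascript
// Assumed inactivity threshold after which a new session begins.
const SESSION_TIMEOUT_MS = 30 * 60 * 1000;

// Group hits (statements) into sessions (monuments): one session per
// visitor per run of activity with no gap longer than timeoutMs.
function groupIntoSessions(hits, timeoutMs = SESSION_TIMEOUT_MS) {
  // Sort a copy by visitor, then by time, so each visitor's hits are contiguous.
  const sorted = [...hits].sort(
    (a, b) => a.visitorId.localeCompare(b.visitorId) || a.timestamp - b.timestamp
  );
  const sessions = [];
  let current = null;
  for (const hit of sorted) {
    const startsNew =
      current === null ||
      current.visitorId !== hit.visitorId ||
      hit.timestamp - current.lastTimestamp > timeoutMs;
    if (startsNew) {
      current = { visitorId: hit.visitorId, hits: [], lastTimestamp: hit.timestamp };
      sessions.push(current);
    }
    current.hits.push(hit);
    current.lastTimestamp = hit.timestamp;
  }
  return sessions;
}

const MINUTE = 60 * 1000;
// Hypothetical hits from two visitors; timestamps are milliseconds from an arbitrary start.
const archiveHits = [
  { visitorId: 'A', timestamp: 0, page: '/education/' },
  { visitorId: 'B', timestamp: 5 * MINUTE, page: '/hr-management/' },
  { visitorId: 'A', timestamp: 10 * MINUTE, page: '/hr-management/' },
  { visitorId: 'A', timestamp: 50 * MINUTE, page: '/education/' },
];

const archive = groupIntoSessions(archiveHits);
console.log(archive.length); // 3
```

Visitor A's third hit arrives 40 minutes after the second, so it opens a new session; the archive here holds three sessions built from four statements.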

Note

Original snippet: […they are simply collected and inscribed to Analytics in a transparent fashion…]

“Transparent” probably isn’t the right term. If you’ve ever seen a page load delayed by a message at the bottom of the browser window that says something like “Loading analytics.google.com/ga.js,” you’ve encountered the code snippet at work, struggling through network latency to load the data to Google’s servers.

References

Foucault, M. (2010). The archaeology of knowledge and the discourse on language. (A. M. Sheridan Smith, Trans.). New York, NY: Vintage Books. (Original work published in 1972)

Google Analytics Academy. (2013, October). Key metrics and dimensions defined [Video transcript]. Digital Analytics Fundamentals. Retrieved from https://analyticsacademy.withgoogle.com/assets/pdf/DigitalAnalyticsFundamentals-Lesson3.2KeymetricsanddimensionsdefinedText.pdf

How Analytics impacts your website code. (2014). Retrieved February 10, 2014, from https://support.google.com/analytics/answer/1008009?hl=en

We use our own products. (2014). Retrieved February 10, 2014, from https://support.google.com/analytics/answer/3000986?hl=en&ref_topic=2919631

[“Rue Foucault”: Creative Commons licensed image by Flickr user sarahstarkweather]

5 thoughts on “Applying Foucault’s Archaeology of Knowledge to Google Analytics”

  1. I’ll start with your quote from Foucault – “The rules of formation are conditions of existence (but also of coexistence, maintenance, modification, and disappearance) in a given discursive division” (p. 38).

    This phrase, “conditions of existence,” seems ideally suited to a discussion of the inner workings of a website, a “text” or locus of activity that for many readers conceals such rules and conditions. This reminded me of Foucault’s concept of trace, which I seem to be returning to again and again. Your case study really highlights the importance of what I’ll call “trace awareness” for those of us engaging in rhetorical analysis (functional or theoretical). Further, because your case study hinges on data, as well as data flow and data nodes and data direction/dispersion, it also adds additional clarity to the connections I was trying to make by applying hardware theory to my first case study of MOOCs. Further, your discussion of exigency and how users’ “exigency inscribes … data” on a space, and thus contributes to its creation and processing activities struck me as an area of thinking I’ll need to pursue more closely as I consider the arena of MOOCs. I tend to think of this moment as a node, and your commentary on this moment as a “statement” helped me make a connection to Foucault with nodes or hubs I had not previously seen so clearly.

    Two questions came to mind while reading your post. The first concerns the notion that the speaker’s identity / identification does not include the web user as well. Given our recent readings about activity theory (and CHAT), as well as cultural-historical approaches to rhetoric, I wonder whether the users/visitors to the web site shouldn’t be folded into this in some way, complicating as they do, through their localized / motive-driven choices of how they move through the page, ways they might eventually affect the way the page is modified at later dates. How do users play into this notion of site-user driven revisions?

    Also, given our activity theory readings, would your Foucauldian application to the agency status of metrics and dimensions nodes allow us to consider these nodes to have “potential energy” or “potential agency”?

    However, those questions aside, I very much appreciate the way you’ve captured a complicated subject (the minutiae of coding has always troubled me) using Foucault, analogy, and Popplet visuals. Very helpful!

    • Thanks for your insightful response! I think Google Analytics, or digital analytics in general, will become an important node in your analysis of MOOCs, as metrics are vital to assessing the success of any such digital effort. While MOOC designers are certainly seeking ways to measure student learning as a result of participating in a MOOC, the specific measurement rubrics will likely involve metrics and dimensions — user behaviors measured and tracked numerically. For example, a MOOC designer might measure the number of students who successfully downloaded a specific assignment sheet relative to the number who successfully uploaded a completed assignment to help measure the engagement level of students in the course. The number of accurately completed assignments could then be compared to the total number of completed assignments submitted to evaluate a level of success at the task being assessed. All of these measurements will be tracked in an interface and system that resembles (or actually is) Google Analytics.

      Something I’ve noticed in completing this assignment, and which Leslie articulated perfectly at the end of her case study, is that our analysis requires us to limit ourselves to discussing a relatively singular aspect of the OoS. In my case, I ended up focusing almost entirely on metrics and dimensions (the actual visit data and what it reflects about user behaviors) rather than on the process of a visitor browsing a web page or the Analytics end user (yours truly) engaging with the data. Both processes are related to Google Analytics by clear and definitive network connections, namely the Google Analytics code snippet (in the case of the visitor) and the Google Analytics web interface (in the case of the end user). I omitted both of these aspects to focus attention on the data itself, but I’m with you on the way web visitors have nearly unlimited agency in creating their own discursive path through a website. That path is traceable in aggregate, and that path offers remarkable insight into the success of the website in achieving its goal(s). User experience (UX) design is about creating the least obtrusively mediated discursive path through the site, regardless of the discursive path taken through the site. And UX is built largely on Analytics results. The same is likely true for the UX of MOOCs. As I develop my theory of networks, the building of the network system, or system of networks, will become a major factor. Isolating a single network simply doesn’t happen in the 21st century; all networks are part of network systems, and there is some question (captured humorously in the Kevin Bacon six degrees of separation game) as to whether there really is such a thing as an individuated network system. There are closed systems, but “closed” doesn’t necessarily mean “separated.” I’m not sure where to take this concept yet, but “interconnected systems” is becoming my mantra.

  2. Pingback: Case Study Gumbo: Responses | Digital Rhetor: A Research Space

  3. Daniel, I very much appreciated the way you used Foucault’s theory of discursive formation to explain how Google Analytics operates within a discursive formation. I’m not that familiar with Google Analytics, so I found the thorough explanation of Google Analytics in the beginning to be very helpful.

    When describing the discursive formation, you did a very good job of describing the way in which Google Analytics represents only a fragment of time within the network. I found so much of your case study valuable to thinking about my own OoS since I also used Foucault in my analysis. You did a great job of describing this, and as I was reading, I realized that I could have done a much better job of describing my map of the La Leche League network as extremely fluid and ever-changing. It was definitely food for thought as I continue thinking about my own OoS. You said, “The statement is not a structural unity; rather, it’s a function of the user’s instantiated interaction with a web page.” This is another place where your insight shed light on my own OoS. In my OoS, this moment is perhaps instead the particular and unique interactions of the mothers with each other and the group leader.

    There were a couple of points of confusion for me, particularly since Foucault is such a dense read and I was still working through the theory when I posted my case study. In this discursive formation, you identified the object as the particular page being visited by the visitor, the subject, and the “subject’s relationship to the page (object) is captured by metrics that measure activity”. Is it possible that the metrics themselves become an object? When the Google Analytics user examines the data, has the data become the object? Is something else the object when this occurs? Or is it perhaps not the page being used that is the object but the use of the page by the subject that is the object?

    Another question that came to mind involves the flow of the discursive formation. According to your analysis of flow, “Analytics data themselves are not in flow.” Does this mean that the discursive activity stops at the level of the user’s interaction with the web page? When the Analytics user examines the data collected and translated by Google Analytics, and then makes decisions based on that data, is that data still not part of the flow? Is Google Analytics simply an archive outside of the discursive formation if the page (which you identified as the object) may be altered depending on the data contained in the analytics, or has it become an important part of the discursive formation? It seems to me that this data is different from archival information that is studied simply as history, but if the archival data is used to make changes that impact the future of experiences with the object, maybe the data is part of the flow within the discourse. If, as you say, it is simply an “archived network,” it may exist outside of the flow, but it seems like calling it a “networked archive” might mean that it is integrated into the flow of discourse. As I said, though, I am still trying to wrap my head around Foucault and I may not fully understand the nuances of all of these concepts. I’d like to see a bit more elaboration, under your discussion of networked archives, about what happens when you, as the archive archaeologist, activate the network. What happens to the discursive formation when the network is activated here? Does the information flow back? Is flow as much backward (from the archive) as it is forward (to the archive)?

    I’m so thankful that you used Foucault in your analysis because it opened my eyes to a new way of understanding Foucault, gave me an opportunity to reflect on Foucault, to examine my understanding of the theory, and also to reflect on my own use of Foucault in my case study. I also very much appreciated your discussion of the archive as part of the network. Very fascinating stuff.

  4. Pingback: Case Study #1 Peer Reviews | jennysmoore
