Representation Learning without Representationalism (in progress)
Both machine learning research and philosophy of model-based science appeal to representations. Given this shared emphasis, it seems natural to conclude that we should understand the epistemic worth of deep learning models in science in terms of their capacity to represent the world. I argue that this is a mistake. I distinguish the prevailing notion of representation used in machine learning research from the concept of scientific representation that figures in the philosophy of model-based science. I argue that the former cannot do the work of the latter in model-based inferences. I then defend an artifactualist approach to machine learning models in science, which aims to understand their epistemic worth in terms of the material practices of constructing and applying them. Machine learning models are instruments that we use to facilitate our epistemic activities in science. They do so without scientific representation.
Automating Exploration: Machine Learning, Understanding, and the Aims of Data-Driven Science (in progress)
In this paper, I draw a parallel between data-driven science and exploratory experiments that sheds light on the generative aims of data-intensive methodologies. Exploratory experimentation is thought to provide the conditions for the development of new scientific concepts. I then draw attention to an under-theorized problem for the application of machine learning (ML) algorithms in service of these aims. I call this problem semantic opacity. Semantic opacity occurs when the knowledge needed to translate the output of an ML system into scientific concepts depends on theoretical assumptions about the same domain of inquiry into which the model purports to grant insight. Semantic opacity is especially likely to occur in exploratory contexts, wherein experimentation is not strongly guided by extant theory. When exploratory methods are mediated by ML, we lack the interpretative tools needed to decide whether the predictions of a model correspond to robust scientific kinds rather than jerry-rigged ones tied to spurious correlations in Big Data. Furthermore, I argue that techniques in explainable AI (XAI) that aim to make these models more interpretable are not well suited to address semantic opacity.
Artifice of Objectivity: why algorithms are necessarily value-laden (in progress)
Algorithmic decision-making systems applied in social contexts drape value-laden solutions in an illusory veil of objectivity. I argue that these systems are necessarily value-laden and that this follows from the need to construct a quantifiable objective function. Many researchers have convincingly argued that machine learning systems learn to replicate and amplify pre-existing biases of moral import found in training data. But these arguments permit a strategic retreat for those who nevertheless maintain that algorithms themselves are value-neutral. Proponents of the value-neutrality of algorithms argue that while the existence of algorithmic bias is undeniable, such bias is merely the product of bad data curation practices. On such a view, eliminating biased data would obliterate any values embedded in algorithmic decision-making. This position can be neatly summarized by the slogan “Algorithms aren’t biased, data is biased.” However, this attitude towards algorithms is misguided. Training machine learning algorithms involves optimization, which requires either minimizing an error function or maximizing an objective function by iteratively adjusting a model’s parameters. The objective function represents the quality of the solution found by the algorithm as a single real number. Training an algorithm thus aggregates countless indicators of predictive success into a single, automatically generated, weighted index. But deciding to operationalize a particular goal in this way is itself a value-laden choice. This is because many qualities we want to predict are qualitative concepts with multifaceted meanings. Concepts such as “health” or “job-applicant-quality” lack sharp boundaries and admit plural and context-dependent meanings. Collapsing such concepts into a quantifiable ratio scale of predictive success flattens out their quality dimensions.
This process is often underdetermined and arbitrary, but convenient for enterprises that rely on precise and unambiguous predictions. Hence, the very choice to use an algorithm in the first place reflects the values and priorities of particular stakeholders.
Getting Real About Neural Data Science (in progress)
Recent research suggests that manifolds play an important role in neural computations. These manifolds are continuous, low-dimensional structures embedded in high-dimensional neural activity. Investigators purport to uncover these structures by using data-analytic techniques to reduce the dimensionality of patterns of neural activity and subsequently reveal the underlying dynamics that are functionally relevant to a specific task. However, the practice of uncovering low-dimensional structures with dimensionality reduction involves modeling choices that introduce a range of implicit assumptions which threaten to cloud our analysis. Yet the theoretical importance of these modeling choices is rarely discussed by the modelers themselves. To what extent do these techniques license realist claims about the existence of neural manifolds and their role in representing and computing information in the brain? I argue that low-dimensional structures uncovered through data analysis are akin to rational reconstructions of the underlying dimensionality on the basis of what is empirically accessible to us. These structures reflect only our analyses and are, strictly speaking, non-factive. However, we can evaluate competing analyses and adjudicate between cases of underdetermination by asking which analysis best reconstructs our best theoretical hypotheses concerning the neural task under investigation. This approach, however, demands a much more robust role for theoretical work concerning the nature and structure of neural computations.