I've been reading some Don Swanson papers recently in an attempt to:
- Approach information retrieval (IR) from a philosophical perspective, as IR is a major part of what I do now but is far too broad a field to easily comprehend without years of experience
- Gain a historical perspective
- Remember material from my undergraduate Philosophy of Mind course (something about P-zombies...or I guess that was something else entirely)
My progress on those fronts continues, but in the meantime, I noticed an interesting point in the 1986 paper, "Undiscovered Public Knowledge":
OK, so that's less of a point and more of a critical element of information search: we can never know everything because we can never search everything. Swanson is specifically discussing scientific literature here, but even if he weren't, do we now have access to technology that renders that issue somewhat less of a concern? We can't search everything, but between heavily optimized database structures, carefully engineered indexing schemes, and deep learning approaches (though I'd rather avoid treating any type of machine learning as a universal, hammer-and-nail solution), can't we get very close?
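To make the indexing point concrete, here's a minimal sketch of an inverted index, the core structure behind most full-text search engines: it maps each term to the set of documents containing it, so a query only touches the postings for its terms instead of scanning the whole collection. The toy "corpus" below is invented, though its topics nod to Swanson's own literature-based discovery examples (fish oil and Raynaud's syndrome, magnesium and migraine):

```python
from collections import defaultdict

# Toy corpus standing in for article abstracts (illustrative only).
docs = {
    1: "magnesium deficiency and migraine",
    2: "fish oil and blood viscosity",
    3: "dietary fish oil in raynauds syndrome",
}

# Build the inverted index: term -> set of doc IDs containing it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(*terms):
    """Return IDs of documents containing all query terms (AND query)."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

print(sorted(search("fish", "oil")))  # [2, 3]
```

Real systems layer tokenization, stemming, ranking, and compressed postings lists on top of this, but the basic trade stays the same: pay an indexing cost up front so that no query ever has to "search everything."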
At the very least, focusing on scientific literature alone, the modern issue becomes less how rapidly new information becomes available than how rapidly it is lost. I suspect this is more of a problem for supplementary data than for manuscripts; data tables are much more difficult to index and are essentially useless without documentation, so every data set available only in a single supplementary Excel spreadsheet is potentially "lost" data. I'm curious how much of this information disappears every day, like melting glaciers or permafrost, never to be seen again, except perhaps through luck or coincidence (for the scientific data, at least; that probably won't work for the ice).
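As a toy illustration of why an undocumented table is so hard to recover by search: a keyword indexer has almost nothing to grab onto when a spreadsheet is bare numbers under cryptic headers. The two CSV fragments below are fabricated, and the column names in the "documented" version are invented for the sketch:

```python
import csv
import io

# Two versions of the same (fabricated) supplementary table.
undocumented = "c1,c2,c3\n0.12,4.5,9\n0.33,2.1,7\n"
documented = (
    "gene_symbol,fold_change,p_value\n"
    "TP53,4.5,0.01\n"
    "BRCA1,2.1,0.03\n"
)

def indexable_terms(csv_text):
    """Collect the non-numeric tokens a text indexer could latch onto."""
    terms = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            try:
                float(cell)  # purely numeric cells carry no searchable meaning
            except ValueError:
                terms.add(cell.lower())
    return terms

print(indexable_terms(undocumented))  # {'c1', 'c2', 'c3'} - nothing meaningful
print(indexable_terms(documented))    # gene names and self-describing headers
```

With documentation, the gene symbols and labeled headers make the table findable by ordinary search; without it, the data is technically public but practically invisible, which is exactly the "lost" state I'm worried about.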