Page Features and Segmentation Update
Before I return once more to my HT image dataset for course projects this semester, I want to collect and update what I know about (1) locating pages with illustrations and (2) extracting those illustrations. I also introduce a new experiment: can machine learning techniques be used to discover the circulation of stereotyped plates? In contrast with (2), this type of search would seem to involve whole pages rather than extracted regions. I will leave for another post the prospects for adding title-page and bibliographic (meta)data to the hunt for duplicate and pirated print objects.