Alessandro Graheli's Web Log

Thursday, November 19, 2009

Cladistics and Stemmatics

I am presently fully engaged in the collation work of the Nyāyamañjarī (sixth āhnika). So far I have collated a portion of NM6 (about one-tenth of the āhnika) using all the available sources, and traced a preliminary stemma. For this purpose I used cladistics software (PAUP* and MacClade), and found it a useful tool, especially for saving time in the statistical examination of the data and for checking possible combinations through a good graphic interface.

The tree produced by the software, however, needs to be elaborated into a real genealogical stemma. The software is basically focused on showing similarities and dissimilarities between the “taxa” (i.e. the mss), so it shows them all at the bottom of the tree and reconstructs the possible branchings and nodes (i.e. archetypes) that have led to them. Each place of variation is read by the software as a “character”, and the different readings found at that place in different mss are read as “states” of that “character” (compare, in biology, the size (state) of the tail (character) at different stages of evolution from ape to human being). In the example here, the tree shows the positions of the taxa (mss) in relation to the character (variant) “bhotsyase” and its various states (bhotsyase, votsusa etc.).
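The character/state encoding described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used with the cladistics software: the sigla W1 to W4, the assignment of readings to them, and the function names are hypothetical, and the "?" for a missing reading follows the usual convention of character matrices.

```python
def encode_character(readings):
    """Give each distinct reading (state) of one variant (character) an integer code.

    Codes are assigned in order of first appearance, so they carry no
    judgement about which reading is original.
    """
    codes = {}
    encoded = {}
    for siglum, reading in readings.items():
        if reading not in codes:
            codes[reading] = len(codes)
        encoded[siglum] = codes[reading]
    return encoded, codes


def build_matrix(collation):
    """Turn a list of {siglum: reading} dicts (one per variant) into rows of states."""
    sigla = sorted({s for locus in collation for s in locus})
    matrix = {s: [] for s in sigla}
    for locus in collation:
        encoded, _ = encode_character(locus)
        for s in sigla:
            # "?" marks a witness that lacks the passage (missing data)
            matrix[s].append(str(encoded[s]) if s in encoded else "?")
    return matrix


# Hypothetical witnesses; only the readings themselves come from the post.
collation = [
    {"W1": "bhotsyase", "W2": "bhotsyase", "W3": "votsyase", "W4": "votsusa"},
]
for siglum, row in build_matrix(collation).items():
    print(siglum, "".join(row))
```

Each row is one ms (taxon) and each column one variant (character); the software then looks for the tree that explains the columns with the fewest changes of state.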

A major problem of the cladistic software is that it reasons in bifurcating terms, that is: from a node it always generates exactly two branches, although in the real world of textual transmission any number of copies may have been produced from a single archetype. This is a serious issue in stemmatics that needs to be sorted out.
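One common way of relaxing a strictly two-branched tree is to collapse internal branches on which no variant actually changes, so that their children attach directly to the node above. The sketch below shows the idea on a toy tree; it is not something done in the collation described here, and the `Node` class, the `changes` counts and the function name are assumptions made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str = ""
    changes: int = 0  # hypothetical count of variants changing on the branch above this node
    children: list = field(default_factory=list)


def collapse_unsupported(node):
    """Return a copy of the tree in which internal nodes reached by a
    zero-change branch are merged into their parent (leaves are kept)."""
    new_children = []
    for child in node.children:
        collapsed = collapse_unsupported(child)
        if collapsed.children and collapsed.changes == 0:
            # No variant supports this node: lift its children one level up.
            new_children.extend(collapsed.children)
        else:
            new_children.append(collapsed)
    return Node(node.name, node.changes, new_children)


# A strictly binary toy tree: root -> (A, X), X -> (B, C), with nothing supporting X.
binary = Node("root", 0, [
    Node("A", 2),
    Node("X", 0, [Node("B", 1), Node("C", 3)]),
])
stemma_like = collapse_unsupported(binary)
print([child.name for child in stemma_like.children])
```

After collapsing, the root has three direct descendants, which is closer to the situation of one exemplar copied several times.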

In my present stemma, one of the Allahabad (Ad) and Mysore (Md) mss is likely to be a transcript of the other (I am not yet sure which of the two is the apograph). Although it is too early to give other verdicts, I am under the impression that the Calicut (Cm in my sigla) and the Pune Bühler (PBs) mss preserve important readings not found elsewhere. Also, I have not found evidence of contamination between sources, except for the Sanskrit College ms (VCd), whose marginalia clearly come from horizontal transmission. The position of the Deccan College (PDd), Srinagar (Ss) and Kolkata (Kd) mss in the tree is still particularly problematic.

Overall, the situation looks promising for a feasible stemma.

4 Comments:

elisa freschi said...: Thanks a lot for this clear explanation, one I have been looking forward to reading for a long time. Could you please add a few words about the stemma of bhotsyase?
Kd and PDd, both reading "votsyase" if I am not wrong, are not grouped together, I guess, because there is no "together" in taxonomy and, hence, in cladistics. So cladistics is at a loss in the case of two identical manuscripts (if there are any!). What about the case of a derived manuscript that adds many mistakes to the one it is copying? If such mistakes are not shared by any other manuscript they are not taken into account, so the two manuscripts would count as the same. What does the software do in such cases?

More generally: does the software not run the risk of badly evaluating variants on the basis of the Roman (not Indian) alphabet? For instance, va and ba are hardly a variant at all in India, but a major one in the Roman alphabet, and ba is not a bha "without something". What do you do to minimise these problems?; November 19, 2009 at 9:36 PM
Alessandro said...: Kd and PDd are considered by the software as deriving from the same ancestor, and thus the branches leading to them are coloured yellow. What do you mean by identical mss? This is a logical impossibility, as far as I know.

An example of the scenario you are referring to is given by Ad and MyOd. Ad is probably a descriptus of MyOd. I know this from other common readings, not reproduced here. The software shows their proximity by assigning them a common ancestor. One then needs to trace a real stemma showing their direct genealogy, perhaps discarding Ad altogether.

As for the Roman or Indian script, the reaction of the software depends on the input, so one needs to take great care in entering the variants, segmenting them etc. In my case, so far I have been entering variants diplomatically, so va and ba do indeed come out as distinct states of the character. But when evaluating each character and its tree, and possibly manipulating it to sort out conflicting branchings, one obviously ignores, and can tell the software to ignore, va/ba and similar variants.
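Ignoring va/ba can also be approximated before the data reach the software, by normalising the transliterated readings and comparing the normalised forms. The following Python sketch assumes Roman transliteration in which the aspirate is written bh; the rule table and function names are hypothetical, and the diplomatic transcription itself stays untouched.

```python
import re
import unicodedata

# Each rule rewrites a spelling difference that should not count as a variant.
# Only the va/ba rule is taken from the discussion; further rules would be added as needed.
RULES = [
    (re.compile(r"b(?!h)"), "v"),  # ba and va count as the same; aspirated bh is left alone
]


def normalize(reading, rules=RULES):
    """Return the reading with stemmatically irrelevant spellings levelled out."""
    text = unicodedata.normalize("NFC", reading)  # compose diacritics consistently
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


def same_state(reading_a, reading_b):
    """True if two diplomatic readings should be treated as one state."""
    return normalize(reading_a) == normalize(reading_b)


print(same_state("botsyase", "votsyase"))   # differ only in ba/va
print(same_state("bhotsyase", "votsyase"))  # bha is a genuine difference
```

Keeping the rules in one list makes it easy to see, and to revise, which differences have been declared insignificant.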

Related to this, there is no cladistic software, to my knowledge, that can work with Devanagari script.

Basically one needs to analyse each tree (the software produces a tree for every variant, i.e. character) and evaluate it, modify it, or even ignore it when it is not meaningful for stemmatic purposes, as in the case of individual variants, random lengthening of vowels etc.; November 20, 2009 at 8:28 PM
elisa freschi said...: Can one modify the output tree? If so, does this affect the software's evaluation of the whole mass of data? That is, while modifying the tree, does one "teach" the system that this is the correct one and that data have to be considered accordingly, or does one only modify a picture?; November 23, 2009 at 10:03 PM
Alessandro said...: The user can decide the level of intervention. One can modify just the output tree, or intervene at deeper levels of the data, as one chooses. It is rather powerful software, though it is difficult to learn all its subtleties.; November 25, 2009 at 1:33 PM
