Starting with tag:
[TAG 0.5
Ketil Malde **20110126154402
 Ignore-this: 159fcdd04188e296ec38fdc6f6307056
]
[bump to 0.5.1
Ketil Malde **20110126154415
 Ignore-this: 7c8c40d13556f67b160e8115363035c7
]
[Add flowclip to examples.
Ketil Malde **20110126154429
 Ignore-this: e5b2e43aac41c176455c2606a5fc1b11
]
[trimFromTo now leaves flowgram untouched!
Ketil Malde **20110126154455
 Ignore-this: d826fe2fc967e8757b8cef710b7ae89c
]
[Really, really add flowclip to .cabal file.
Ketil Malde **20110126155149
 Ignore-this: ffd8674dae2ae0909fa5b7987247a6e
]
[avoid n+k pattern
Ketil Malde **20110207085722
 Ignore-this: 19fd0a2d61b2020c81fc2285a90146d2
]
[Get with the times - remove glasgow-exts, add ParListComp pragma
Ketil Malde **20110211125847
 Ignore-this: 87e308df054d4f689ba4dbcc76cf3f91
]
[Avoid unfortunate (re)naming that broke the type system.
Ketil Malde **20110211130335
 Ignore-this: ba962547f9d4e878f6915fd217d4f911
]
[zero-pad flowgrams when trimming a read.
Ketil Malde **20110221141643
 Ignore-this: 6014e1146f8ec0303a56984c1b4178b6
]
[Added 'orf' example.
Ketil Malde **20110309131347
 Ignore-this: 2ed544ea76513ac4e156395c6348c9b7
]
[Add comments, clean up a bit, incorporate first STP into the ORF
Ketil Malde **20110309133804
 Ignore-this: 54eaf53e0399dd5b778b713e7e912f8f
]
[Added rselect-pe (selects randomly from Illumina paired ends)
Ketil Malde **20110318115328
 Ignore-this: 290e5ec2ce194907a52190f5aeb00996
]
[bump to 0.5.0.1
Ketil Malde **20110321173133
 Ignore-this: 605064b188a00e4def8e8ab3b95e25cd
]
[TAG 0.5.0.1
Ketil Malde **20110321173143
 Ignore-this: 67598e0e70a8b81fc05cb71d9d50dc1b
]
[bump to 0.5.1 (resolve conflict)
Ketil Malde **20110321173407
 Ignore-this: de4cc81772bc7a829209e2b7b997c7a5
]
[Allow empty sequence id's.
Ketil Malde **20110407092930
 Ignore-this: 290002427f9bb1126fd09b49f0120777
]
[Allow fastq without sequence label before quality (empty + lines)
Ketil Malde **20110407092958
 Ignore-this: f54ce3ea4cf6a91cd65c95cb4bb963b
]
[Informative error message in case somebody tries to hash with too
Ketil Malde **20110512122421
 Ignore-this: c3b2dfd8700a67ef403d6f039a0ab608
 large k.
]
[No longer dangerous - well, at least you get an error
Ketil Malde **20110513120834
 Ignore-this: cb11f95b2d59e54ecf4496285c9c4007
]
[Add QC to deps, otherwise build fails.
Ketil Malde **20110713104411
 Ignore-this: eff828902f52430365cc6a8b47822014
]
[I'd like to be able to read the tags that are added to the end of .phd files by programs like consed and polyphred. They look like this:
dfornika@gmail.com**20110926230614
 Ignore-this: d6b47cfde3e591a14cf9beaaa71bd2ce

 BEGIN_TAG
 TYPE: heterozygoteCT
 SOURCE: polyPhred
 UNPADDED_READ_POS: 98 98
 DATE: 09/22/11 15:17:47
 BEGIN_COMMENT
 99
 END_COMMENT
 END_TAG

 The first step is to stop the 'mkPhd' function in Phd.hs from continuing to read data beyond the 'END_DNA' line into the Sequence Nuc output. I've accomplished this by just adding one extra step that breaks the sd bytestring down to a sd' bytestring and a td (tag data) bytestring.

 I've tried my patch out in ghci, by reading an example .phd file (with tags) and the output went from this:

 ID ------------------------------------------------------------------
 ca-21.s
 COMMENT -------------------------------------------------------------
 CHROMAT_FILE: ca-21.s ABI_THUMBPRINT: 056131235240000327232065337334 PHRED_VERSION: 0.990722.g CALL_METHOD: phred QUALITY_LEVELS: 99 TIME: Thu Apr 6 09:53:26 2000 TRACE_ARRAY_MIN_INDEX: 0 TRACE_ARRAY_MAX_IND
 EX: 9244 TRIM: 25 439 0.0500 CHEM: prim DYE: rhod
 DATA ----------------------------------------------------------------
 0 tgcgcgtgta tatgatgtaa tgtctctttc tactgagcta agagagccgt actggggaga
 . . .
 720 catcaactan aataaacaac attaaUNPAD DED_READ_P OS:DATE:UN PADDED_REA
 780 D_POS:DATE :UNPADDED_ READ_POS:D ATE:UNPADD ED_READ_PO S:DATE:UNP
 840 ADDED_READ _POS:DATE: UNPADDED_R EAD_POS:DA TE:UNPADDE D_READ_POS
 900 :DATE:

 to this:

 ID ------------------------------------------------------------------
 ca-21.s
 COMMENT -------------------------------------------------------------
 CHROMAT_FILE: ca-21.s ABI_THUMBPRINT: 056131235240000327232065337334 PHRED_VERSION: 0.990722.g CALL_METHOD: phred QUALITY_LEVELS: 99 TIME: Thu Apr 6 09:53:26 2000 TRACE_ARRAY_MIN_INDEX: 0 TRACE_ARRAY_MAX_IND
 EX: 9244 TRIM: 25 439 0.0500 CHEM: prim DYE: rhod
 DATA ----------------------------------------------------------------
 0 tgcgcgtgta tatgatgtaa tgtctctttc tactgagcta agagagccgt actggggaga
 60 gaggacatga ggaggttaca cggtgagaga agtacaatac aaggcatgtg gtgtacacaa
 120 gtgtctctct tcctccactc attagctgtg tggcctcagt ttcttcgttt ctaaacaact
 180 gttcttccta cctcataggg gagaagttca actgaggcgg ctgaaatgag gatattttac
 240 agcgtttcag gccattatta ttgcacaaga ggaatttgta acaggaagaa gtaggagagt
 300 ttggtgggct caggctttgg acttaaaagc aagaccctgc agggaggttt gctctcagag
 360 cttaggatgc acgcagagac ccgcgtccct aaaccccact cccagcttca aggcccctca
 420 cctcagctgg accacagccg cactcgtata ctggtcgaaa aaaaaacacc cactaacaca
 480 ccacaaccca taactattac cacaaaaaat tcactaacaa acacaccaat accatataac
 540 taattattct cttcaccaca acataaaaac tataaataaa aacaccaaaa cataaacaca
 600 aatacaataa cacaacaaan ttcacataaa taaacaaaac aatcaccaca ttaaatacac
 660 aataaaacac acaatcaaac cacaataaaa acaaaactaa tataatcaac aacataatca
 720 catcaactan aataaacaac attaa

 Looks good to me! I don't see how this would break anything, but of course that's up to the package maintainer(s) to decide!

 Stop mkPhd from reading extra tag data from .phd files into Sequence Nuc data structure.
]
[Conditional options, depending on ghc version.
Ketil Malde **20111215113517
 Ignore-this: 2707fe57c811731ac80b0e4c280a35b5
]
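
The .phd tag patch recorded above describes splitting the sd bytestring at the 'END_DNA' line into sd' (base calls) and td (tag data). A minimal sketch of that idea follows. It is not the actual code from Phd.hs: the helper name splitAtEndDna is hypothetical, and the use of strict Data.ByteString.Char8 is an assumption made to keep the example short.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Char8 as B

-- | Hypothetical helper (not taken from Phd.hs): split the data that follows
-- BEGIN_DNA into the base-call part (sd') and whatever comes after the
-- END_DNA marker (td, the tag data). If the marker is absent, td is empty.
splitAtEndDna :: B.ByteString -> (B.ByteString, B.ByteString)
splitAtEndDna sd = (sd', td)
  where
    marker      = "END_DNA"
    (sd', rest) = B.breakSubstring marker sd
    td          = B.drop (B.length marker) rest

main :: IO ()
main = do
  -- Made-up sample input in the shape of a .phd body with one trailing tag.
  let sample = B.unlines
        [ "t 9 6", "g 12 18", "END_DNA", "END_SEQUENCE"
        , "BEGIN_TAG", "TYPE: heterozygoteCT", "END_TAG" ]
      (dna, tags) = splitAtEndDna sample
  B.putStrLn "base calls:"
  B.putStr dna
  B.putStrLn "tag data:"
  B.putStr tags
```

Only the first component would then be parsed into the Sequence Nuc value, so tag field names such as UNPADDED_READ_POS no longer leak into the sequence data.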