diff --git a/README.md b/README.md
index 4b519eb..dcbd044 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 novel-stats produces word count statistics for novels written in Markdown
-format, including total word count, per-chapter word counts, per-act word
-counts, and counts by chapter "status." You might find this useful if you're
+format, including total word count, word count by status, and optionally
+per-chapter and per-act word counts. You might find this useful if you're
 already using tools like Git and Markdown processing as part of your writing
 workflow (or are looking to start) and want some basic statistics about your
 novel as you're writing it.
@@ -9,24 +9,52 @@ novel-stats is fairly particular about the format of the novel and doesn't
 currently include much in the way of error checking. Word counts may not be
 exact.
 
-Example output:
+Example output with no optional data:
 
 ```bash
 $ novel-stats example.md
-chapter 1: 103 words (drafted)
-chapter 2: 83 words (dev edited)
-chapter 3: 115 words
-chapter 4: 96 words
-chapter 5: 136 words (drafted)
-
-act 1: 187 words (~34%)
-act 2: 212 words (~39%)
-act 3: 137 words (~25%)
-drafted: 239 words (~44%)
-dev edited: 83 words (~15%)
+drafted: 237 words (~43%)
+dev edited: 82 words (~15%)
 total: 539 words
 ```
 
+Example output with chapter data:
+
+```bash
+$ novel-stats example.md -c
+chapter 1: 103 (drafted)
+chapter 2: 83 (dev edited)
+chapter 3: 115
+chapter 4: 96
+chapter 5: 136 (drafted)
+
+drafted: 237 words (~43%)
+dev edited: 82 words (~15%)
+total: 539 words
+```
+
+Example with multi-file markdown:
+
+```bash
+$ novel-stats multi_file.mdpp -pp -c -a
+chapter 1 Lorem:
+    203 (drafted)
+    303 (dev edited)
+    506 words (total)
+chapter 2 Ipsum: 84 (dev edited)
+chapter 3 Dolor: 116
+chapter 4 Sit: 97
+chapter 5 Amet: 137 (drafted)
+
+act 1: 591 words (~62%)
+act 2: 214 words (~22%)
+act 3: 138 words (~14%)
+
+drafted: 336 words (~35%)
+dev edited: 385 words (~40%)
+total: 946 words
+```
+
 ## Installation
 
 Start by cloning the project with git. Then install it with Python's `pip`.
@@ -43,16 +71,24 @@ easier):
 pip3 install --editable /path/to/novel-stats
 ```
 
-
 ## Usage
 
 novel-stats takes a single argument: The path to your novel file in markdown
 format. For instance:
 
 ```bash
-novel-stats /path/to/your/novel.md
+novel-stats /path/to/your/novel.md[pp] [-c/--chapter] [-a/--act] [-pp]
 ```
 
+### Optional flags
+
+* -c or --chapter: output a chapter-by-chapter breakdown of word counts,
+including how many words in each chapter are tagged with which status
+* -a or --act: output an act-by-act breakdown of word counts (total only)
+* -pp: run the Markdown preprocessor, which allows multi-file input (e.g.
+each chapter in its own file) but requires the MarkdownPP Python
+library.
+
 ## Markdown format
 
 You'll need to format your novel in the expected format for novel-stats to
@@ -126,11 +162,43 @@ If you do use this feature, you should set the status at the top of each
 chapter, before the actual chapter contents (and after any chapter status).
 
 
+### Comments
+
+Comments, such as outlining notes for yourself, can be added anywhere using:
+
+```yaml
+[//]: # (This text is completely ignored.)
+```
+
+These words do not count towards the word count.
+
+
+### Multi-file support
+
+Splitting your novel into multiple files is supported using the `MarkdownPP`
+Python library. To include a secondary file inside the main one, simply use
+
+```yaml
+!INCLUDE "OtherFile.md"
+```
+
+and add the `-pp` flag to novel-stats.
+
 ### Example novel
 
-novel-stats includes an example Markdown file `example.md` that illustrates
-the expected Markdown format. Try it out:
 
+novel-stats includes two examples:
+1. Markdown file `example.md` that illustrates the expected Markdown format
+for a single file. Try it out:
+
+```bash
+$ novel-stats example.md
 ```
-novel-stats example.md
+
+2. A six-file example in the `example` folder with the main file
+`multi_file.mdpp`. You can try this one out with:
+
+```bash
+$ cd example
+$ novel-stats multi_file.mdpp -pp
 ```
diff --git a/example/Chapter1.md b/example/Chapter1.md
new file mode 100644
index 0000000..4017814
--- /dev/null
+++ b/example/Chapter1.md
@@ -0,0 +1,17 @@
+## 1 Lorem
+
+[status]: # (drafted)
+[act]: # (1)
+
+*Lorem* ipsum dolor sit amet, consectetur adipiscing elit. Ut cursus malesuada leo. Phasellus justo orci, auctor ac maximus vitae, aliquet ornare urna. Etiam porttitor tristique ligula, et dictum mauris consequat vel. Curabitur fringilla velit posuere, imperdiet mauris auctor, varius nibh. Vestibulum sed mauris maximus, vehicula leo sit amet, sodales enim. Maecenas tempor nibh nec egestas aliquam. Proin non nibh eget tellus porttitor pharetra. Phasellus hendrerit, nunc quis lobortis finibus, lacus massa lobortis justo, sit amet vulputate urna magna sit amet dui. Ut facilisis sem orci, sit amet dignissim ligula rutrum quis. Nulla iaculis urna eget varius pellentesque. Nulla pulvinar orci sollicitudin consequat volutpat. Nullam tempus lectus sed est lacinia, et blandit odio tempor. In in quam luctus, convallis sem nec, dapibus elit. Nunc ornare, neque sodales maximus faucibus, lectus velit tincidunt elit, eu blandit nulla turpis sit amet ex. Curabitur ullamcorper mi non quam pharetra, eget cursus sem dapibus.
+
+**Nullam** ac elementum arcu, eu congue orci. Sed blandit quam non vulputate porta. Donec laoreet metus sit amet ex feugiat, in scelerisque est varius. Curabitur nec elit vel ante consequat gravida. ***Aliquam* ultrices** dolor vel eros hendrerit condimentum. Donec efficitur turpis quis eros viverra venenatis. Praesent ultricies dolor nec justo consectetur consectetur.
+
+[status]: # (dev edited)
+
+***Maecenas* nec** mi sapien. Vestibulum tortor tortor, feugiat in est nec, vestibulum faucibus magna. Pellentesque elementum elit sed metus ornare lobortis. Nunc molestie, justo id ultricies elementum, nibh libero suscipit massa, feugiat pharetra felis mi ut lacus. Pellentesque ornare pretium mi, in commodo nulla dignissim vel. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Vivamus sed dolor ut mi mattis sagittis vitae ac dolor. Integer tincidunt diam sapien, vitae tincidunt neque semper sit amet. Cras mi risus, faucibus et lacinia et, eleifend sed nunc. Sed faucibus consectetur justo, non accumsan orci imperdiet quis.
+
+Sed sed porta ante. Sed viverra dui sit amet eros rutrum volutpat. Aliquam eu nulla congue, cursus lectus sit amet, congue tellus. Maecenas id aliquam libero. Maecenas ultrices blandit aliquam. Sed pretium ut ipsum eu pharetra. Nullam at nunc vitae erat luctus varius non auctor felis. Ut vel dignissim nibh, sit amet gravida mauris. Orci varius natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Curabitur id lobortis erat. Etiam facilisis turpis ut libero cursus, sit amet cursus diam hendrerit. Sed nisi lacus, semper quis ante ac, dictum ullamcorper lacus. Aenean vel gravida mauris, a egestas nisi.
+
+Quisque fermentum sagittis mi. Aliquam erat volutpat. Sed vehicula quam non nunc porta sagittis. Phasellus eu dignissim arcu, non volutpat quam. Vestibulum aliquam leo eget justo pulvinar placerat. Interdum et malesuada fames ac ante ipsum primis in faucibus. Sed sed lacus tempus, tempus dui sed, lobortis magna. Vivamus dui eros, eleifend id ipsum eget, tincidunt porta metus. Pellentesque efficitur pharetra arcu nec auctor. Pellentesque euismod tincidunt risus, vel blandit leo iaculis et. Sed pellentesque lectus nisi, at faucibus purus laoreet in. Curabitur rhoncus lobortis blandit. Nulla et imperdiet risus, eu facilisis arcu. Ut tincidunt justo in eros vehicula feugiat.
+
diff --git a/example/Chapter2.md b/example/Chapter2.md
new file mode 100644
index 0000000..3193625
--- /dev/null
+++ b/example/Chapter2.md
@@ -0,0 +1,16 @@
+## 2 Ipsum
+
+[status]: # (dev edited)
+
+Nullam id cursus velit, et lobortis est. Sed consequat diam risus, ac
+hendrerit mauris facilisis vitae. Vestibulum blandit enim nibh, ut vehicula
+augue hendrerit sit amet. Proin gravida elit quis erat dapibus ornare. Etiam
+suscipit eget tortor eget facilisis. Phasellus finibus nunc quis urna
+ultricies elementum. Quisque faucibus pharetra augue eu consectetur. Proin
+vehicula, nisl ac maximus volutpat, turpis orci imperdiet quam, ac tempor erat
+lectus in leo. Nullam et efficitur ipsum. Nulla felis turpis, blandit ultrices
+eros venenatis, sagittis convallis lectus.
+
+
+[//]: # (Testing out a comment)
+
diff --git a/example/Chapter3.md b/example/Chapter3.md
new file mode 100644
index 0000000..8fe543a
--- /dev/null
+++ b/example/Chapter3.md
@@ -0,0 +1,17 @@
+## 3 Dolor
+
+[act]: # (2)
+
+Mauris eu orci at velit scelerisque feugiat nec tristique nunc. Curabitur vel
+dolor imperdiet, iaculis nunc sit amet, volutpat sapien. Phasellus enim ipsum,
+varius a sollicitudin a, dapibus id magna. Etiam vitae sollicitudin orci.
+Vivamus dapibus lacinia risus eu pellentesque. Lorem ipsum dolor sit amet,
+consectetur adipiscing elit. Donec ut nisl non mi suscipit scelerisque.
+Curabitur quis accumsan velit, ac convallis lectus. Curabitur aliquet nisi et
+magna tincidunt, in euismod orci rutrum. Aliquam sed erat eget ipsum
+sollicitudin mollis. Donec accumsan euismod rhoncus. Proin molestie ut mauris
+quis egestas. Sed cursus varius leo at suscipit. Aenean ultricies sodales mi,
+non varius ex laoreet quis. Morbi a nisl fringilla lorem mollis consequat et
+id ex.
+
+
diff --git a/example/Chapter4.md b/example/Chapter4.md
new file mode 100644
index 0000000..50e0eb3
--- /dev/null
+++ b/example/Chapter4.md
@@ -0,0 +1,13 @@
+## 4 Sit
+
+Sed eget metus tristique, tincidunt purus non, euismod massa. Class aptent
+taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.
+Vestibulum egestas scelerisque neque. Pellentesque et ultrices lorem, at
+mattis enim. Nam sit amet quam sapien. Maecenas fringilla nisl sit amet ipsum
+feugiat condimentum. Praesent ac justo placerat, ornare lectus quis, volutpat
+turpis. Etiam eu blandit nibh. Sed tincidunt facilisis massa vitae mattis.
+Pellentesque vitae lectus et massa sollicitudin varius. Proin varius libero eu
+elit mollis egestas. Ut interdum lacus tempor velit ullamcorper, quis
+consequat tortor lobortis. Donec pulvinar pretium quam eu fringilla.
+
+
diff --git a/example/Chapter5.md b/example/Chapter5.md
new file mode 100644
index 0000000..525cf45
--- /dev/null
+++ b/example/Chapter5.md
@@ -0,0 +1,19 @@
+## 5 Amet
+
+[status]: # (drafted)
+[act]: # (3)
+
+Cras eget egestas enim. Donec faucibus lacus malesuada magna bibendum, eget
+molestie purus gravida. Vivamus leo erat, dapibus non tristique a, fringilla
+eget felis. Phasellus efficitur, nibh eu sollicitudin tristique, urna tellus
+ultricies ligula, sit amet facilisis libero risus sodales dolor. Quisque nec
+tortor a ligula porttitor egestas et vel dui. Integer lorem sem, luctus vel
+enim ac, rhoncus vestibulum urna. Maecenas eu sem id eros interdum congue.
+Nunc quis turpis id nibh aliquam varius eget eget tortor. Morbi faucibus nisi
+sit amet arcu sollicitudin, sit amet luctus lorem pulvinar. Aliquam velit
+nulla, viverra a turpis eget, venenatis hendrerit sapien. Aliquam a sem
+vehicula, tempor purus non, fringilla felis. Ut venenatis massa lacus, et
+malesuada leo vehicula vitae. In vel nunc id metus semper ornare. Duis quis
+tellus eleifend, tristique ex sit amet, mattis ligula.
+
+
diff --git a/example/multi_file.mdpp b/example/multi_file.mdpp
new file mode 100644
index 0000000..2685f1b
--- /dev/null
+++ b/example/multi_file.mdpp
@@ -0,0 +1,8 @@
+# Title of the Novel
+### Author Name
+
+!INCLUDE "Chapter1.md"
+!INCLUDE "Chapter2.md"
+!INCLUDE "Chapter3.md"
+!INCLUDE "Chapter4.md"
+!INCLUDE "Chapter5.md"
\ No newline at end of file
diff --git a/novel_stats/novel_stats.py b/novel_stats/novel_stats.py
index 3bc1034..9b88db4 100755
--- a/novel_stats/novel_stats.py
+++ b/novel_stats/novel_stats.py
@@ -2,14 +2,13 @@
 
 
 import collections
-import os
-import string
 import sys
 
 
 CHAPTER_MARKER = '## '
 STATUS_MARKER = '[status]: # '
 ACT_MARKER = '[act]: # '
+COMMENT_MARKER = '[//]: # ' # Standard markdown comment marker, supported by pandoc and calibre's ebook-convert
 
 
 def count_words(line):
@@ -27,71 +26,96 @@ def count_words(line):
 def main():
     arguments = sys.argv[1:]
     filename = arguments[0]
-    chapter_number = None
-    act_number = None
+    mdfile = None
+
+    if '-pp' in arguments:
+        # -pp flag to allow Markdown Preprocessing primarily to allow multi-file novel formatting
+        # this is implemented using a temporary file created using python's built-in tempfile library
+        import MarkdownPP, tempfile
+        mdfile = tempfile.TemporaryFile(mode='w+')
+        MarkdownPP.MarkdownPP(input=open(filename), output=mdfile, modules=list(MarkdownPP.modules))
+        mdfile.seek(0)
+    else:
+        mdfile = open(filename)
+
+    chapter_heading = None
+    act_heading = None
     total_word_count = 0
     word_count_by_chapter = collections.defaultdict(int)
     word_count_by_status = collections.defaultdict(int)
     word_count_by_act = collections.defaultdict(int)
     status_by_chapter = {}
+    current_status = None
 
-    for line in open(filename).readlines():
+    for line in mdfile.readlines():
         if line.startswith(CHAPTER_MARKER):
-            word_count_by_act[act_number] += word_count_by_chapter[chapter_number]
-            total_word_count += word_count_by_chapter[chapter_number]
-            if chapter_number in status_by_chapter:
-                word_count_by_status[status_by_chapter[chapter_number]] += 1
+            word_count_by_act[act_heading] += word_count_by_chapter[chapter_heading]
+            total_word_count += word_count_by_chapter[chapter_heading]
 
-            chapter_number = int(line[len(CHAPTER_MARKER):])
+            chapter_heading = line[len(CHAPTER_MARKER):].strip('()\n')
 
-            word_count_by_chapter[chapter_number] = 1 # Start at one, because the chapter number itself counts as a word.
-            if chapter_number in status_by_chapter:
-                word_count_by_status[chapter_status] += 1
-        elif line.startswith(STATUS_MARKER):
-            status_by_chapter[chapter_number] = line[len(STATUS_MARKER):].strip('()\n')
+            word_count_by_chapter[chapter_heading] = count_words(chapter_heading) # Count the words in chapter heading, because the chapter number and title count as words.
+
+            status_by_chapter[chapter_heading] = collections.defaultdict(int)
+            current_status = None
+        elif line.startswith(STATUS_MARKER): # Modified to allow multiple statuses in a single chapter, can swap back and forth.
+            if current_status is None:
+                current_status = line[len(STATUS_MARKER):].strip('()\n')
+                status_by_chapter[chapter_heading][current_status] = count_words(chapter_heading)
+            else:
+                current_status = line[len(STATUS_MARKER):].strip('()\n')
+                status_by_chapter[chapter_heading][current_status] += 0
         elif line.startswith(ACT_MARKER):
-            act_number = int(line[len(ACT_MARKER):].strip('()\n'))
-            word_count_by_act[act_number] = 1
+            act_heading = line[len(ACT_MARKER):].strip('()\n')
+            word_count_by_act[act_heading] = count_words(act_heading)
+        elif line.startswith(COMMENT_MARKER): # don't count the words in a comment
+            pass
         else:
             line_word_count = count_words(line)
-            word_count_by_chapter[chapter_number] += line_word_count
+            word_count_by_chapter[chapter_heading] += line_word_count
 
-            if chapter_number in status_by_chapter:
-                word_count_by_status[status_by_chapter[chapter_number]] += line_word_count
+            if current_status:
+                word_count_by_status[current_status] += line_word_count
+                status_by_chapter[chapter_heading][current_status] += line_word_count
+
+    mdfile.close()
 
     # Do some final accounting after the last chapter.
-    word_count_by_act[act_number] += word_count_by_chapter[chapter_number]
-    total_word_count += word_count_by_chapter[chapter_number]
-    if chapter_number in status_by_chapter:
-        word_count_by_status[status_by_chapter[chapter_number]] += 1
+    word_count_by_act[act_heading] += word_count_by_chapter[chapter_heading]
+    total_word_count += word_count_by_chapter[chapter_heading]
 
-    # Print out word counts.
-    for chapter_number, chapter_word_count in word_count_by_chapter.items():
-        if chapter_number is None:
-            continue
+    if '-c' in arguments or '--chapter' in arguments: # -c or --chapter to give a chapter-by-chapter word count summary
+        for chapter_heading, chapter_word_count in word_count_by_chapter.items():
+            if chapter_heading is None:
+                continue
 
-        chapter_status = status_by_chapter.get(chapter_number)
+            if len(status_by_chapter[chapter_heading]) > 1:
+                print(f'chapter {chapter_heading}:')
 
-        print(
-            'chapter {}: {:,} words{}'.format(
-                chapter_number,
-                chapter_word_count,
-                ' ({})'.format(chapter_status) if chapter_status else '',
-            )
-        )
+                for chapter_status, status_count in status_by_chapter[chapter_heading].items():
+                    print(f'\t {status_count:,} ({chapter_status})')
+                print(f'\t {chapter_word_count:,} words (total)')
+            elif len(status_by_chapter[chapter_heading]) == 1:
+                chapter_status = list(status_by_chapter[chapter_heading].keys())[0]
+                print(f'chapter {chapter_heading}: {chapter_word_count:,} ({chapter_status})')
+            else:
+                print(f'chapter {chapter_heading}: {chapter_word_count:,}')
 
-    print()
+        print()
 
-    for act_number, act_word_count in word_count_by_act.items():
-        if act_number is None:
-            continue
+    if '-a' in arguments or '--act' in arguments: # -a or --act to give an act-by-act word count summary
+        for act_heading, act_word_count in word_count_by_act.items():
+            if act_heading is None:
+                continue
 
-        print('act {}: {:,} words (~{}%)'.format(act_number, act_word_count, act_word_count * 100 // total_word_count))
+            print('act {}: {:,} words (~{}%)'.format(act_heading, act_word_count, act_word_count * 100 // total_word_count))
+
+        print()
 
     for status, status_word_count in word_count_by_status.items():
-        print('{}: {:,} words (~{}%)'.format(status, status_word_count, status_word_count * 100 // total_word_count))
+        print(f'{status}: {status_word_count:,} words (~{status_word_count * 100 // total_word_count}%)')
 
-    print('total: {:,} words'.format(total_word_count))
+    print(f'total: {total_word_count:,} words')
 
 
 if __name__ == '__main__':