From 2b5d6dc08079aa2f2ec7b55eb5fc2998fb34be7c Mon Sep 17 00:00:00 2001
From: A1exRey
Date: Tue, 30 Oct 2018 13:00:35 +0300
Subject: [PATCH 01/16] Create quiz-1-response.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Added the quiz answer
---
 .../quizzes/quiz-1/quiz-1-response.md         | 51 +++++++++++++++++++
 1 file changed, 51 insertions(+)
 create mode 100644 2018-komp-ling/quizzes/quiz-1/quiz-1-response.md

diff --git a/2018-komp-ling/quizzes/quiz-1/quiz-1-response.md b/2018-komp-ling/quizzes/quiz-1/quiz-1-response.md
new file mode 100644
index 00000000..714bc52c
--- /dev/null
+++ b/2018-komp-ling/quizzes/quiz-1/quiz-1-response.md
@@ -0,0 +1,51 @@
# Quiz 1

1. Which problems does maxmatch suffer from? (Choose all that
   apply.)

   a) requires a comprehensive dictionary

   d) constructs non-grammatical sentences

2. Write a perl/sed substitution with regular expressions that
   adds whitespace for segmentation around "/" in "either/or"
   expressions but not around fractions "1/2":

   Answer:

       sed 's|\([[:alpha:]]\)/\([[:alpha:]]\)|\1 / \2|g'

3. The text mentions several times that machine-learning
   techniques produce better segmentation than rule-based
   systems; what are some downsides of machine-learning
   techniques compared to rule-based ones?

   Answer:
   1) risk of overfitting the training data
   2) a trained model is language-specific and has to be retrained for every new language
   3) lack of interpretability

4. Write a sentence (in English or in Russian) which maxmatch
   segments incorrectly.

   Answer:

       При правовых вопросах

       Приправовыхвопросах

       Приправ о вы х вопросах

5. What are problems for sentence segmentation? Provide one
   example in English or Russian for each that applies.

   a) ambiguous abbreviations with punctuation ("Mr. Smith arrived.")

   c) sentences lacking separating punctuation (e.g. chat or headline text with no final punctuation: "see you tomorrow same place")
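Answers 1 and 4 can be made concrete with a toy maxmatch implementation. This is a sketch; the dictionary below is an illustrative stub (a real system needs a comprehensive one, which is exactly the weakness named in answer 1a):

```python
# Toy maxmatch: greedily take the longest dictionary word at each
# position; characters with no dictionary match become one-letter tokens.
DICT = {'приправ', 'при', 'правовых', 'вопросах', 'вопрос'}

def maxmatch(text, dictionary=DICT):
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # longest candidate first
            if text[i:j] in dictionary:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # fall back to a single character
            i += 1
    return tokens

print(maxmatch('приправовыхвопросах'))
# → ['приправ', 'о', 'в', 'ы', 'х', 'вопросах']
```

The greedy longest match swallows "приправ" ("seasonings") and then emits single-letter debris, reproducing the non-grammatical segmentation from answer 4.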
From fb67124ce613e9c769745371a14f91f4ce0f0578 Mon Sep 17 00:00:00 2001
From: A1exRey
Date: Tue, 30 Oct 2018 13:40:37 +0300
Subject: [PATCH 02/16] added segmentation report
---
 .../segmentation/segmentation-response.md     | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
 create mode 100644 2018-komp-ling/practicals/segmentation/segmentation-response.md

diff --git a/2018-komp-ling/practicals/segmentation/segmentation-response.md b/2018-komp-ling/practicals/segmentation/segmentation-response.md
new file mode 100644
index 00000000..fa2dc210
--- /dev/null
+++ b/2018-komp-ling/practicals/segmentation/segmentation-response.md
@@ -0,0 +1,19 @@
+ +

# Overview of two sentence-tokenization libraries

This report tries out two sentence-tokenization libraries: pragmatic segmenter (Ruby) and NLTK (Python). A slice of a Russian Wikipedia dump was used as the test text.

## Pragmatic segmenter (Ruby)

Pragmatic segmenter is a rule-based library for Ruby. On the Russian Wikipedia sample its quality was below average: most abbreviations, name initials, and the like were wrongly split into separate sentences. On the whole, the library is geared toward languages written in the Latin alphabet.

## NLTK (Python)

sent_tokenize() is the NLTK function for sentence-boundary detection. Under the hood it is an unsupervised machine-learning algorithm that can also be trained on one's own data. NLTK ships with a set of pre-trained models, including one for Russian. Overall this library performed better than the Ruby one: most abbreviations and initials are segmented correctly; the only remaining problem is abbreviations with spaces inside them.
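The failure mode described for rule-based splitting can be illustrated with a toy splitter. The naive rule below is mine, not taken from either library:

```python
import re

def naive_split(text):
    # Naive rule: a sentence ends at . ! or ? followed by whitespace
    # and a capital letter -- exactly the rule that initials break.
    return re.split(r'(?<=[.!?])\s+(?=[А-ЯA-Z])', text)

print(naive_split('Рассказы А. П. Чехова переиздали. Тираж вырос.'))
# The initials "А." and "П." are wrongly treated as sentence ends,
# so we get four pieces instead of two:
# → ['Рассказы А.', 'П.', 'Чехова переиздали.', 'Тираж вырос.']
```

A trained model such as NLTK's learns that single capital letters followed by a period are usually abbreviations, which is why it handled the Wikipedia sample better.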
From 5f7229aace713700997c36d77e010b29659f8c01 Mon Sep 17 00:00:00 2001 From: A1exRey Date: Mon, 12 Nov 2018 21:39:13 +0300 Subject: [PATCH 03/16] =?UTF-8?q?=D0=94=D0=BE=D0=B1=D0=B0=D0=B2=D0=BB?= =?UTF-8?q?=D0=B5=D0=BD=D0=BE=20=D0=BE=D0=BF=D0=B8=D1=81=D0=B0=D0=BD=D0=B8?= =?UTF-8?q?=D0=B5=20=D0=B4=D0=BB=D1=8F=20=D1=82=D1=80=D0=B0=D0=BD=D1=81?= =?UTF-8?q?=D0=BB=D0=B8=D1=82=D0=B5=D1=80=D0=B0=D1=86=D0=B8=D0=B8?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../practicals/transliteration-response.md | 42 +++++++++++++++++++ 1 file changed, 42 insertions(+) create mode 100644 2018-komp-ling/practicals/transliteration-response.md diff --git a/2018-komp-ling/practicals/transliteration-response.md b/2018-komp-ling/practicals/transliteration-response.md new file mode 100644 index 00000000..6532626d --- /dev/null +++ b/2018-komp-ling/practicals/transliteration-response.md @@ -0,0 +1,42 @@ +# Practical 2: Transliteration (engineering) + +
## Questions

What to do with ambiguous letters? For example, Cyrillic `е` could be either je or e.

Can you think of a way that you could provide mappings from many characters to one character?
For example sh → ш or дж → c?

How might you make different mapping rules for characters at the beginning or end of the string?

### Transliteration rules

The main idea is to start the transliteration with the complex, multi-letter mappings (ч - tch). For example:
>Шарик -- sh-арик -- sharik

Next, replace all vowels at the beginning and at the end of a word (Я - ya):
>яблоко -- ya-блоко -- yabloko

After that, the simple one-letter mappings can be applied (у - u):
>мед -- med

## Methods

### Encoding and decoding with KOI8-R

Transliterating via the KOI8-R encoding is not the most accurate method, but it has some properties that the other approaches lack:

1) the original text can be recovered;

2) the encoding rules are already defined.

The KOI8-R transliteration method is implemented in the file transliterate_koi8r.py.

### Encoding and decoding with rules

The transliteration rules live in the file rules.txt. It defines rules for consonants, for vowels, and for vowels at the beginning of a word.
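The "multi-letter rules first" idea above can be sketched as a greedy longest-match scan. The rule subset below, including дж → j, is illustrative only and is not the project's full rules.txt:

```python
# Tiny illustrative rule subset; 'дж' -> 'j' is an assumed multi-letter rule.
RULES = {'дж': 'j', 'ш': 'sh', 'ч': 'tch', 'а': 'a', 'е': 'e', 'и': 'i',
         'к': 'k', 'м': 'm', 'р': 'r', 'д': 'd'}

def translit(word):
    out, i = [], 0
    while i < len(word):
        # Try the longest rules first so "дж" wins over "д".
        for n in (2, 1):
            chunk = word[i:i + n]
            if len(chunk) == n and chunk in RULES:
                out.append(RULES[chunk])
                i += n
                break
        else:
            out.append(word[i])  # no rule: keep the character as-is
            i += 1
    return ''.join(out)

print(translit('шарик'))  # → sharik
print(translit('джем'))   # → jem
```

Extending this with position-dependent rules (word-initial vowels, as in rules.txt) only requires checking `i == 0` before the lookup.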
From 9bed2a385c85c82a66cee86f8c26e6338482e71d Mon Sep 17 00:00:00 2001 From: A1exRey Date: Mon, 12 Nov 2018 21:41:36 +0300 Subject: [PATCH 04/16] Create transliteration-response.md --- .../transliteration-response.md | 42 +++++++++++++++++++ 1 file changed, 42 insertions(+) create mode 100644 2018-komp-ling/practicals/transliteration/transliteration-response.md diff --git a/2018-komp-ling/practicals/transliteration/transliteration-response.md b/2018-komp-ling/practicals/transliteration/transliteration-response.md new file mode 100644 index 00000000..6532626d --- /dev/null +++ b/2018-komp-ling/practicals/transliteration/transliteration-response.md @@ -0,0 +1,42 @@ +# Practical 2: Transliteration (engineering) + +
## Questions

What to do with ambiguous letters? For example, Cyrillic `е` could be either je or e.

Can you think of a way that you could provide mappings from many characters to one character?
For example sh → ш or дж → c?

How might you make different mapping rules for characters at the beginning or end of the string?

### Transliteration rules

The main idea is to start the transliteration with the complex, multi-letter mappings (ч - tch). For example:
>Шарик -- sh-арик -- sharik

Next, replace all vowels at the beginning and at the end of a word (Я - ya):
>яблоко -- ya-блоко -- yabloko

After that, the simple one-letter mappings can be applied (у - u):
>мед -- med

## Methods

### Encoding and decoding with KOI8-R

Transliterating via the KOI8-R encoding is not the most accurate method, but it has some properties that the other approaches lack:

1) the original text can be recovered;

2) the encoding rules are already defined.

The KOI8-R transliteration method is implemented in the file transliterate_koi8r.py.

### Encoding and decoding with rules

The transliteration rules live in the file rules.txt. It defines rules for consonants, for vowels, and for vowels at the beginning of a word.
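The KOI8-R method mentioned above rests on a design feature of that encoding: clearing the 8th bit of a KOI8-R byte yields the Latin letter conventionally paired with the Cyrillic one (with case flipped). A minimal standalone illustration, not taken from the practical's files:

```python
# KOI8-R places Cyrillic letters so that masking off the high bit
# gives a rough Latin transliteration (lowercase comes out uppercase).
def koi8r_translit(word):
    return ''.join(chr(b & 0x7F) for b in word.encode('koi8-r'))

print(koi8r_translit('привет'))  # → PRIWET
```

Because the mapping is a fixed property of the byte layout, no rule table is needed, which is the second advantage listed above.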
From 0f4e3e17d4e1e80d5f08234306b1e1f01a69252b Mon Sep 17 00:00:00 2001 From: A1exRey Date: Mon, 12 Nov 2018 21:41:54 +0300 Subject: [PATCH 05/16] Delete transliteration-response.md --- .../practicals/transliteration-response.md | 42 ------------------- 1 file changed, 42 deletions(-) delete mode 100644 2018-komp-ling/practicals/transliteration-response.md diff --git a/2018-komp-ling/practicals/transliteration-response.md b/2018-komp-ling/practicals/transliteration-response.md deleted file mode 100644 index 6532626d..00000000 --- a/2018-komp-ling/practicals/transliteration-response.md +++ /dev/null @@ -1,42 +0,0 @@ -# Practical 2: Transliteration (engineering) - -
## Questions

What to do with ambiguous letters? For example, Cyrillic `е` could be either je or e.

Can you think of a way that you could provide mappings from many characters to one character?
For example sh → ш or дж → c?

How might you make different mapping rules for characters at the beginning or end of the string?

### Transliteration rules

The main idea is to start the transliteration with the complex, multi-letter mappings (ч - tch). For example:
>Шарик -- sh-арик -- sharik

Next, replace all vowels at the beginning and at the end of a word (Я - ya):
>яблоко -- ya-блоко -- yabloko

After that, the simple one-letter mappings can be applied (у - u):
>мед -- med

## Methods

### Encoding and decoding with KOI8-R

Transliterating via the KOI8-R encoding is not the most accurate method, but it has some properties that the other approaches lack:

1) the original text can be recovered;

2) the encoding rules are already defined.

The KOI8-R transliteration method is implemented in the file transliterate_koi8r.py.

### Encoding and decoding with rules

The transliteration rules live in the file rules.txt. It defines rules for consonants, for vowels, and for vowels at the beginning of a word.
From 5dd808654a72165fc2b629db4e5ab4bb3f46979f Mon Sep 17 00:00:00 2001
From: A1exRey
Date: Mon, 12 Nov 2018 21:43:51 +0300
Subject: [PATCH 06/16] =?UTF-8?q?=D0=94=D0=BE=D0=BC=D0=B0=D1=88=D0=BD?=
 =?UTF-8?q?=D0=B5=D0=B5=20=D0=B7=D0=B0=D0=B4=D0=B0=D0=BD=D0=B8=D0=B5=202?=
 =?UTF-8?q?=20--=20=D1=82=D1=80=D0=B0=D0=BD=D1=81=D0=BB=D0=B8=D1=82=D0=B5?=
 =?UTF-8?q?=D1=80=D0=B0=D1=86=D0=B8=D1=8F?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
---
 .../practicals/transliteration/rank.py        | 44 ++++++++++++++
 .../practicals/transliteration/rules.txt      | 36 ++++++++++++
 .../transliteration/transliterate.py          | 58 +++++++++++++++++++
 .../transliteration/transliterate_koi8r.py    | 49 ++++++++++++++++
 4 files changed, 187 insertions(+)
 create mode 100644 2018-komp-ling/practicals/transliteration/rank.py
 create mode 100644 2018-komp-ling/practicals/transliteration/rules.txt
 create mode 100644 2018-komp-ling/practicals/transliteration/transliterate.py
 create mode 100644 2018-komp-ling/practicals/transliteration/transliterate_koi8r.py

diff --git a/2018-komp-ling/practicals/transliteration/rank.py b/2018-komp-ling/practicals/transliteration/rank.py
new file mode 100644
index 00000000..476ab50c
--- /dev/null
+++ b/2018-komp-ling/practicals/transliteration/rank.py
@@ -0,0 +1,44 @@
#!/usr/bin/python

import sys, getopt

def main(argv):
    inputfile = ''
    outputfile = 'ranked.txt'
    try:
        opts, args = getopt.getopt(argv, "hi:o:", ["ifile=", "ofile="])
    except getopt.GetoptError:
        print('rank.py -i <inputfile> -o <outputfile>')
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print('rank.py -i <inputfile> -o <outputfile>')
            sys.exit()
        elif opt in ("-i", "--ifile"):
            inputfile = arg
        elif opt in ("-o", "--ofile"):
            outputfile = arg
    print('Input file is:', inputfile)
    print('Output file is:', outputfile)

    freq = []
    with open(inputfile, 'r', encoding='utf8') as fd:
        for line in fd.readlines():
            line = line.strip('\n')
            (f, w) = line.split('\t')
            freq.append((int(f), w))

    # ensure descending frequency order before ranking
    freq.sort(reverse=True)

    rank = 1
    min_f = freq[0][0]
    ranks = []
    for i in range(0, len(freq)):
        if freq[i][0] < min_f:
            rank = rank + 1
            min_f = freq[i][0]
        ranks.append((rank, freq[i][0], freq[i][1]))

    # write one "rank<TAB>frequency<TAB>word" line per entry
    with open(outputfile, 'w+', encoding='utf8') as fd:
        for (r, f, w) in ranks:
            fd.write('%d\t%d\t%s\n' % (r, f, w))

if __name__ == "__main__":
    main(sys.argv[1:])
diff --git a/2018-komp-ling/practicals/transliteration/rules.txt b/2018-komp-ling/practicals/transliteration/rules.txt
new file mode 100644
index 00000000..a3bdc243
--- /dev/null
+++ b/2018-komp-ling/practicals/transliteration/rules.txt
@@ -0,0 +1,36 @@
 я ya
 ю yu
 е ye
а a
б b
в v
г g
д d
е e
ё yo
ж zsh
з z
и i
й y
к k
л l
м m
н n
о o
п p
р r
с s
т t
у u
ф f
х h
ц ts
ч tch
ш ch
щ scsh
ъ '
ы uy
ь '
э a
ю u
я a
diff --git a/2018-komp-ling/practicals/transliteration/transliterate.py b/2018-komp-ling/practicals/transliteration/transliterate.py
new file mode 100644
index 00000000..73974730
--- /dev/null
+++ b/2018-komp-ling/practicals/transliteration/transliterate.py
@@ -0,0 +1,58 @@
#!/usr/bin/python

import sys, getopt

def main(argv):
    inputfile = ''
    outputfile = '__transliterated.conllu'
    try:
        opts, args = getopt.getopt(argv, "hi:o:", ["ifile=", "ofile="])
    except getopt.GetoptError:
        print('transliterate.py -i <inputfile> -o <outputfile>')
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print('transliterate.py -i <inputfile> -o <outputfile>')
            sys.exit()
        elif opt in ("-i", "--ifile"):
            inputfile = arg
        elif opt in ("-o", "--ofile"):
            outputfile = arg
    print('Input file is:', inputfile)
    print('Output file is:', outputfile)

    vocab = []
    with open(inputfile, 'r', encoding='utf8') as test:
        for line in test.readlines():
            if '\t' not in line:
                continue
            row = line.replace('\n', '').split('\t')
            if len(row) != 10:
                continue
            vocab.append(row)

    # Load the rules; a leading space in rules.txt marks a mapping
    # that applies only at the beginning of a word.
    rules_for = {}
    with open('rules.txt', 'r', encoding='utf8') as fd:
        for line in fd:
            line = line.rstrip('\n')
            if not line.strip():
                continue
            src, dst = line.rsplit(' ', 1)
            rules_for[src] = dst

    # Transliterate the FORM column and store the result in MISC.
    for idx, row in enumerate(vocab):
        out = ''
        for j, ch in enumerate(row[1].lower()):
            if j == 0 and ' ' + ch in rules_for:
                out += rules_for[' ' + ch]  # word-initial rule
            elif ch in rules_for:
                out += rules_for[ch]
            else:
                out += ch
        vocab[idx][9] = out

    with open(outputfile, 'w+', encoding='utf8') as fd:
        for w in vocab:
            fd.write('\t'.join(w) + '\n')

if __name__ == "__main__":
    main(sys.argv[1:])
diff --git a/2018-komp-ling/practicals/transliteration/transliterate_koi8r.py b/2018-komp-ling/practicals/transliteration/transliterate_koi8r.py
new file mode 100644
index 00000000..fe6799f1
--- /dev/null
+++ b/2018-komp-ling/practicals/transliteration/transliterate_koi8r.py
@@ -0,0 +1,49 @@
#!/usr/bin/python

import sys, getopt

def main(argv):
    inputfile = ''
    outputfile = '__transliterated.conllu'
    try:
        opts, args = getopt.getopt(argv, "hi:o:", ["ifile=", "ofile="])
    except getopt.GetoptError:
        print('transliterate_koi8r.py -i <inputfile> -o <outputfile>')
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print('transliterate_koi8r.py -i <inputfile> -o <outputfile>')
            sys.exit()
        elif opt in ("-i", "--ifile"):
            inputfile = arg
        elif opt in ("-o", "--ofile"):
            outputfile = arg
    print('Input file is:', inputfile)
    print('Output file is:', outputfile)

    vocab = []
    with open(inputfile, 'r', encoding='utf8') as test:
        for line in test.readlines():
            if '\t' not in line:
                continue
            row = line.replace('\n', '').split('\t')
            if len(row) != 10:
                continue
            vocab.append(row)

    # The KOI8-R trick: masking off the high bit of a KOI8-R byte
    # yields a rough Latin transliteration of the Cyrillic letter.
    for idx, row in enumerate(vocab):
        try:
            oldone = row[1].encode('koi8-r')
            vocab[idx][9] = ''.join([chr(c & 0x7F) for c in oldone])
        except UnicodeEncodeError:
            continue

    with open(outputfile, 'w+', encoding='utf8') as fd:
        for w in vocab:
            fd.write('\t'.join(w) + '\n')

if __name__ == "__main__":
    main(sys.argv[1:])
From 85cd857f296120bb0ec454c2e8bdfb8f5fd2023d Mon Sep 17 00:00:00 2001
From: A1exRey
Date: Mon, 10 Dec 2018 21:48:19 +0300
Subject: [PATCH 07/16] =?UTF-8?q?=D0=94=D0=BE=D0=B1=D0=B0=D0=B2=D0=BB?=
 =?UTF-8?q?=D0=B5=D0=BD=D1=8B=20=D0=BE=D1=82=D0=B2=D0=B5=D1=82=D1=8B=20?=
=?UTF-8?q?=D0=BD=D0=B0=20=D1=82=D0=B5=D1=81=D1=823?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../quizzes/quiz3/quiz-3-response.md | 65 +++++++++++++++++++ 1 file changed, 65 insertions(+) create mode 100644 2018-komp-ling/quizzes/quiz3/quiz-3-response.md diff --git a/2018-komp-ling/quizzes/quiz3/quiz-3-response.md b/2018-komp-ling/quizzes/quiz3/quiz-3-response.md new file mode 100644 index 00000000..916a0618 --- /dev/null +++ b/2018-komp-ling/quizzes/quiz3/quiz-3-response.md @@ -0,0 +1,65 @@ + + +
# Quiz 3

1. In the reading, it is claimed that to implement a morphological disambiguator for an unseen language, it takes roughly the same amount of time whether annotating a corpus to train on versus writing constraint grammar rules.

   a) Give an argument for why constraint grammar rules are more valuable

   Constraint grammar rules give a high precision score, but the recall score is low most of the time.
   So, if the constraint rules can be implemented simply, we should use constraint grammar.

   b) Give an argument for why corpus annotation and HMM training is more valuable

   Conversely, an HMM gives a high recall score (higher than CG rules),
   but an HMM will rarely reach the precision level of CG rules.

2. Can the two systems be used together? Explain.

   Yes. The basis of the grammar is composed of constraint rules. Yet, when the rules
   cannot provide a solution, there is room for elements that carry
   probabilistic features; this contributes to the robustness of the grammar.
   So we should apply the CG rules first, and the HMM after that.

3. Give a sentence with morphosyntactic ambiguity.
   What would you expect a disambiguator to do in this situation? What can you do?

   'Косой косой косил косой'

   In this case the disambiguator will give us

       [A, A, V, N]

   because it assumes that next to the verb there must be exactly one noun.
   But the real PoS tags are

       [A, N, V, N]

4. Choose several (>2) quantities that evaluate the quality of a morphological disambiguator,
   and describe how to compute them. Describe what it would mean to have disambiguators which
   differ in quality based on which quantity is used.

   To compute the scores (TP, FN, etc.) we aggregate the outcomes over every tag we have.
   Example: take all NOUN tags from the gold standard and from the answers of our model. If:

   • the standard and the answer are both NOUN = TP

   • the standard is NOUN, but the answer is not = FN

   • the answer is NOUN, but the standard is not = FP

   • neither the standard nor the answer is NOUN = TN

   Then we sum these counts over all answers and calculate Precision and Recall.

5. Give an example where an n-gram HMM performs better than a unigram HMM tagger.
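The per-tag computation described in question 4 can be sketched in a few lines (toy data; the tag names follow the quiz's shorthand):

```python
# Per-tag precision and recall from parallel gold/predicted tag sequences.
def prf(gold, pred, tag):
    tp = sum(1 for g, p in zip(gold, pred) if g == tag and p == tag)
    fn = sum(1 for g, p in zip(gold, pred) if g == tag and p != tag)
    fp = sum(1 for g, p in zip(gold, pred) if g != tag and p == tag)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

gold = ['A', 'N', 'V', 'N']   # 'Косой косой косил косой'
pred = ['A', 'A', 'V', 'N']   # what the disambiguator returned
print(prf(gold, pred, 'N'))   # → (1.0, 0.5)
```

Two disambiguators can then differ in quality depending on the quantity: a CG-style system tends to win on the precision number, an HMM-style system on the recall number, matching the trade-off in questions 1 and 2.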
From f0f2ccf7f9478a66d9f525e8a59b0de42bdbc704 Mon Sep 17 00:00:00 2001
From: A1exRey
Date: Sun, 24 Mar 2019 16:21:01 +0300
Subject: [PATCH 08/16] HW 4 is done
---
 .../practicals/unigra_tagger/unigram.md       | 68 +++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100644 2018-komp-ling/practicals/unigra_tagger/unigram.md

diff --git a/2018-komp-ling/practicals/unigra_tagger/unigram.md b/2018-komp-ling/practicals/unigra_tagger/unigram.md
new file mode 100644
index 00000000..b370d769
--- /dev/null
+++ b/2018-komp-ling/practicals/unigra_tagger/unigram.md
@@ -0,0 +1,68 @@
# matplotlib

Computing the ranks for our text:

````
import matplotlib.pyplot as plt

freq = []
ranks = []

# load data
with open('./../freq.txt', 'r') as fd:
    lines = fd.readlines()
for line in lines:
    line = line.strip('\n')
    (count, w) = line.split('\t')
    freq.append((int(count), w))

freq.sort(reverse=True)

# rank the data: equal frequencies share a rank
rank = 1
min_f = freq[0][0]
for i in range(0, len(freq)):
    if freq[i][0] < min_f:
        rank += 1
        min_f = freq[i][0]
    ranks.append([rank, freq[i][0], freq[i][1]])

# plot rank against frequency
x = []
y = []
for row in ranks:
    x.append(int(row[0]))
    y.append(int(row[1]))
plt.plot(x, y, 'b*')
plt.show()
````
# ElementTree

### How would you get just the Icelandic line and the gloss line ?
+ +```` +for tier in root.findall('.//tier'): + if tier.attrib['id'] == 'n': + for item in tier.findall('.//item'): + if item.attrib['tag'] != 'T': # here is the condition + print(item.text) +```` + +# scikit learn + +### Perceptron answers +```` +- #хоругвь# incorrect class: 0 correct class: 1 +- #обувь# incorrect class: 0 correct class: 1 +- #морковь# incorrect class: 0 correct class: 1 +- #бровь# incorrect class: 0 correct class: 1 +- #церковь# incorrect class: 0 correct class: 1 +0.982857142857142856 +```` +To improve the quiality of our model we should use MLP, or deeper (than 1 layer) models + +# Screenscraping + +done in __screencap.py__ + From 153a22aa893faa37a65b1b6f04609b0f2555f7ed Mon Sep 17 00:00:00 2001 From: A1exRey Date: Sun, 24 Mar 2019 16:21:16 +0300 Subject: [PATCH 09/16] Add files via upload --- .../practicals/unigra_tagger/screencap.py | 43 +++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100644 2018-komp-ling/practicals/unigra_tagger/screencap.py diff --git a/2018-komp-ling/practicals/unigra_tagger/screencap.py b/2018-komp-ling/practicals/unigra_tagger/screencap.py new file mode 100644 index 00000000..61ff9c6d --- /dev/null +++ b/2018-komp-ling/practicals/unigra_tagger/screencap.py @@ -0,0 +1,43 @@ +#дерево +import sys + +def strip_html(h): + output = '' + inTag = False + for c in h: + if c == '<': + inTag = True + continue + if c == '>': + inTag = False + continue + if not inTag: + output += c + return output + +stem = '_' +zkod = '_' +ipa = '_' + + +h1 = '_' +for line in sys.stdin.readlines(): + line = line.strip() + text = strip_html(line) + if line.count('
<h1
') > 0:
        h1 = strip_html(line)
    if h1 != 'Русский':
        continue
    if text.count('Корень:') > 0:
        stem = text.split(':')[1].split(';')[0]
    if text.count('МФА') > 0:
        ipa = text.split(';')[3].split('&')[0]
    if text.count('тип склонения') > 0:
        zkod = text.split('тип склонения')[1].strip().split(' ')[0].strip("^")


if stem != '_' and zkod != '_' and ipa != '_':
    print('%s\t%s\t%s' % (stem, zkod, ipa))
    stem = '_'
    zkod = '_'
    ipa = '_'
\ No newline at end of file

From f710b9996ee80520e1332717a4b4fd503423bcdd Mon Sep 17 00:00:00 2001
From: A1exRey
Date: Tue, 26 Mar 2019 15:48:15 +0300
Subject: [PATCH 10/16] HW 4 Pletenev
---
 .../Unigram_part_of_speech_tagger_response.md | 68 +++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100644 2018-komp-ling/practicals/Unigram part-of-speech tagger/Unigram_part_of_speech_tagger_response.md

diff --git a/2018-komp-ling/practicals/Unigram part-of-speech tagger/Unigram_part_of_speech_tagger_response.md b/2018-komp-ling/practicals/Unigram part-of-speech tagger/Unigram_part_of_speech_tagger_response.md
new file mode 100644
index 00000000..b370d769
--- /dev/null
+++ b/2018-komp-ling/practicals/Unigram part-of-speech tagger/Unigram_part_of_speech_tagger_response.md
@@ -0,0 +1,68 @@
# matplotlib

Computing the ranks for our text:

````
import matplotlib.pyplot as plt

freq = []
ranks = []

# load data
with open('./../freq.txt', 'r') as fd:
    lines = fd.readlines()
for line in lines:
    line = line.strip('\n')
    (count, w) = line.split('\t')
    freq.append((int(count), w))

freq.sort(reverse=True)

# rank the data: equal frequencies share a rank
rank = 1
min_f = freq[0][0]
for i in range(0, len(freq)):
    if freq[i][0] < min_f:
        rank += 1
        min_f = freq[i][0]
    ranks.append([rank, freq[i][0], freq[i][1]])

# plot rank against frequency
x = []
y = []
for row in ranks:
    x.append(int(row[0]))
    y.append(int(row[1]))
plt.plot(x, y, 'b*')
plt.show()
````
# ElementTree

### How would you get just the Icelandic line and the gloss line ?
+ +```` +for tier in root.findall('.//tier'): + if tier.attrib['id'] == 'n': + for item in tier.findall('.//item'): + if item.attrib['tag'] != 'T': # here is the condition + print(item.text) +```` + +# scikit learn + +### Perceptron answers +```` +- #хоругвь# incorrect class: 0 correct class: 1 +- #обувь# incorrect class: 0 correct class: 1 +- #морковь# incorrect class: 0 correct class: 1 +- #бровь# incorrect class: 0 correct class: 1 +- #церковь# incorrect class: 0 correct class: 1 +0.982857142857142856 +```` +To improve the quiality of our model we should use MLP, or deeper (than 1 layer) models + +# Screenscraping + +done in __screencap.py__ + From 77ea44d19436979f076fd70f16b283954e9d45e5 Mon Sep 17 00:00:00 2001 From: A1exRey Date: Tue, 26 Mar 2019 15:49:09 +0300 Subject: [PATCH 11/16] HW 4 Pletenev --- .../screencap.py | 43 +++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100644 2018-komp-ling/practicals/Unigram part-of-speech tagger/screencap.py diff --git a/2018-komp-ling/practicals/Unigram part-of-speech tagger/screencap.py b/2018-komp-ling/practicals/Unigram part-of-speech tagger/screencap.py new file mode 100644 index 00000000..9421d8c5 --- /dev/null +++ b/2018-komp-ling/practicals/Unigram part-of-speech tagger/screencap.py @@ -0,0 +1,43 @@ +#дерево +import sys + +def strip_html(h): + output = '' + inTag = False + for c in h: + if c == '<': + inTag = True + continue + if c == '>': + inTag = False + continue + if not inTag: + output += c + return output + +stem = '_' +zkod = '_' +ipa = '_' + + +h1 = '_' +for line in sys.stdin.readlines(): + line = line.strip() + text = strip_html(line) + if line.count('
<h1
') > 0: + h1 = strip_html(line) + if h1 != 'Русский': + continue + if text.count('Корень:') > 0: + stem = text.split(':')[1].split(';')[0] + if text.count('МФА') > 0: + ipa = text.split(';')[3].split('&')[0] + if text.count('тип склонения') > 0: + zkod = text.split('тип склонения')[1].strip().split(' ')[0].strip("^") + + +if stem != '_' and zkod != '_' and ipa != '_': + print('%s\t%s\t%s' % (stem, zkod, ipa)) + stem = '_' + zkod = '_' + ipa = '_' From 2fc793abd219c50da8735131c4b4d51ed8d3867c Mon Sep 17 00:00:00 2001 From: A1exRey Date: Tue, 26 Mar 2019 15:50:26 +0300 Subject: [PATCH 12/16] Delete unigram.md --- .../practicals/unigra_tagger/unigram.md | 68 ------------------- 1 file changed, 68 deletions(-) delete mode 100644 2018-komp-ling/practicals/unigra_tagger/unigram.md diff --git a/2018-komp-ling/practicals/unigra_tagger/unigram.md b/2018-komp-ling/practicals/unigra_tagger/unigram.md deleted file mode 100644 index b370d769..00000000 --- a/2018-komp-ling/practicals/unigra_tagger/unigram.md +++ /dev/null @@ -1,68 +0,0 @@ -# matplotlib - -Получение рангов для нашего текста - -```` -import matplotlib.pyplot as plt - -freq = [] -ranks = [] - -#load data -with open('./../freq.txt', 'r') as f: - f = f.readlines() -for line in f: - line = line.strip('\n') - (f, w) = line.split('\t') - freq.append((int(f), w)) - -freq.sort(reverse=True) - -#ranking data -rank = 1 -min = freq[0][0] -for i in range(0, len(freq)): - if freq[i][0] < min: - rank +=1 - min = freq[i][0] - ranks.append([rank, freq[i][0], freq[i][1]]) - -#do the plots -x = [] -y = [] -for line in ranks: - row = line - x.append(int(row[0])) - y.append(int(row[1])) -plt.plot(x, y, 'b*') -plt.show() -```` -# ElementTree - -### How would you get just the Icelandic line and the gloss line ? 
- -```` -for tier in root.findall('.//tier'): - if tier.attrib['id'] == 'n': - for item in tier.findall('.//item'): - if item.attrib['tag'] != 'T': # here is the condition - print(item.text) -```` - -# scikit learn - -### Perceptron answers -```` -- #хоругвь# incorrect class: 0 correct class: 1 -- #обувь# incorrect class: 0 correct class: 1 -- #морковь# incorrect class: 0 correct class: 1 -- #бровь# incorrect class: 0 correct class: 1 -- #церковь# incorrect class: 0 correct class: 1 -0.982857142857142856 -```` -To improve the quiality of our model we should use MLP, or deeper (than 1 layer) models - -# Screenscraping - -done in __screencap.py__ - From b8217766d34f7febba442fbe671c55af43967648 Mon Sep 17 00:00:00 2001 From: A1exRey Date: Tue, 26 Mar 2019 15:50:36 +0300 Subject: [PATCH 13/16] Delete screencap.py --- .../practicals/unigra_tagger/screencap.py | 43 ------------------- 1 file changed, 43 deletions(-) delete mode 100644 2018-komp-ling/practicals/unigra_tagger/screencap.py diff --git a/2018-komp-ling/practicals/unigra_tagger/screencap.py b/2018-komp-ling/practicals/unigra_tagger/screencap.py deleted file mode 100644 index 61ff9c6d..00000000 --- a/2018-komp-ling/practicals/unigra_tagger/screencap.py +++ /dev/null @@ -1,43 +0,0 @@ -#дерево -import sys - -def strip_html(h): - output = '' - inTag = False - for c in h: - if c == '<': - inTag = True - continue - if c == '>': - inTag = False - continue - if not inTag: - output += c - return output - -stem = '_' -zkod = '_' -ipa = '_' - - -h1 = '_' -for line in sys.stdin.readlines(): - line = line.strip() - text = strip_html(line) - if line.count('
<h1
') > 0:
        h1 = strip_html(line)
    if h1 != 'Русский':
        continue
    if text.count('Корень:') > 0:
        stem = text.split(':')[1].split(';')[0]
    if text.count('МФА') > 0:
        ipa = text.split(';')[3].split('&')[0]
    if text.count('тип склонения') > 0:
        zkod = text.split('тип склонения')[1].strip().split(' ')[0].strip("^")


if stem != '_' and zkod != '_' and ipa != '_':
    print('%s\t%s\t%s' % (stem, zkod, ipa))
    stem = '_'
    zkod = '_'
    ipa = '_'
\ No newline at end of file

From 4d9f5874f9e3b85450f0a1b6aff8feb504df9559 Mon Sep 17 00:00:00 2001
From: A1exRey
Date: Fri, 29 Mar 2019 21:56:39 +0300
Subject: [PATCH 14/16] HW 5 Pletenev
---
 .../xrenner_practical/xrenner-response.md     | 34 +++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100644 2018-komp-ling/practicals/xrenner_practical/xrenner-response.md

diff --git a/2018-komp-ling/practicals/xrenner_practical/xrenner-response.md b/2018-komp-ling/practicals/xrenner_practical/xrenner-response.md
new file mode 100644
index 00000000..4534f8de
--- /dev/null
+++ b/2018-komp-ling/practicals/xrenner_practical/xrenner-response.md
@@ -0,0 +1,34 @@
# Xrenner Response

First of all, we must do the preparation, such as installing xrenner.
The standard model for English works just fine:

    $ python3 xrenner.py -m eng -o html example_in.conll10 > /mnt/c/sub_wsl/example.html

Next, we must make our own language model (in this case, Russian). We can build the model from scratch or copy the meta-language folder:

    $ cp -R ./models/udx ./models/rus

### Rules

Add some rules to our model.
__pronouns.tab:__

    я 1sg
    мы 1pl
    он male
    она fema
    его male
    её fema
    меня 1sg
    нас 1pl

__coref.tab:__

    Рабиндранат Тагор|Тагор coref

These simple rules give us good results (see pushkin.html):

    $ python3 xrenner.py -m rus -o html pushkin.conllu > /mnt/c/sub_wsl/pushkin.html

From 74eddc6573e58e9a94a7ae5c6b4a533dadc1437c Mon Sep 17 00:00:00 2001
From: A1exRey
Date: Fri, 29 Mar 2019 21:57:33 +0300
Subject: [PATCH 15/16] HW 5 Pletenev
---
 .../practicals/xrenner_practical/example.html | 540 ++++++++++++++++++
 .../practicals/xrenner_practical/pushkin.html | 133 +++++
 2 files changed, 673 insertions(+)
 create mode 100644 2018-komp-ling/practicals/xrenner_practical/example.html
 create mode 100644 2018-komp-ling/practicals/xrenner_practical/pushkin.html

diff --git a/2018-komp-ling/practicals/xrenner_practical/example.html b/2018-komp-ling/practicals/xrenner_practical/example.html
new file mode 100644
index 00000000..b3d25489
--- /dev/null
+++ b/2018-komp-ling/practicals/xrenner_practical/example.html
@@ -0,0 +1,540 @@
+[540 lines of xrenner HTML output: the coreference-annotated token stream of the Wikinews article "New Zealand begins process to consider changing national flag design"; the span markup was stripped in extraction, so the token dump is omitted here]
+early +in +2016 +. + + + \ No newline at end of file diff --git a/2018-komp-ling/practicals/xrenner_practical/pushkin.html b/2018-komp-ling/practicals/xrenner_practical/pushkin.html new file mode 100644 index 00000000..c9fe846a --- /dev/null +++ b/2018-komp-ling/practicals/xrenner_practical/pushkin.html @@ -0,0 +1,133 @@ + + + + + + + + + + +Однажды +
+[133 lines of xrenner HTML output: the coreference-annotated token stream of the Pushkin--Tagore anecdote; the span markup was stripped in extraction, so the token dump is omitted here]
\ No newline at end of file

From 48231e94313daa243c77c9f002688ef02e7b47b6 Mon Sep 17 00:00:00 2001
From: A1exRey
Date: Tue, 2 Apr 2019 14:58:50 +0300
Subject: [PATCH 16/16] Pletenev Sergey homeworks

---
 .../Unigram-part-of-speech-tagger-response.md | 68 +++++++++++++++++++
 .../practicals/segmentation-response.md       | 19 ++++++
 .../practicals/transliteration-response.md    | 42 ++++++++++++
 2018-komp-ling/practicals/xrenner-response.md | 34 ++++++++++
 4 files changed, 163 insertions(+)
 create mode 100644 2018-komp-ling/practicals/Unigram-part-of-speech-tagger-response.md
 create mode 100644 2018-komp-ling/practicals/segmentation-response.md
 create mode 100644 2018-komp-ling/practicals/transliteration-response.md
 create mode 100644 2018-komp-ling/practicals/xrenner-response.md

diff --git a/2018-komp-ling/practicals/Unigram-part-of-speech-tagger-response.md b/2018-komp-ling/practicals/Unigram-part-of-speech-tagger-response.md
new file mode 100644
index 00000000..b370d769
--- /dev/null
+++ b/2018-komp-ling/practicals/Unigram-part-of-speech-tagger-response.md
@@ -0,0 +1,68 @@
+# matplotlib
+
+Getting the frequency ranks for our text:
+
+````
+import matplotlib.pyplot as plt
+
+freq = []
+ranks = []
+
+# load the frequency list: one "frequency<TAB>word" entry per line
+with open('./../freq.txt', 'r') as fobj:
+    lines = fobj.readlines()
+for line in lines:
+    line = line.strip('\n')
+    (f, w) = line.split('\t')
+    freq.append((int(f), w))
+
+freq.sort(reverse=True)
+
+# assign ranks: items that share a frequency share a rank
+rank = 1
+cur_min = freq[0][0]
+for i in range(0, len(freq)):
+    if freq[i][0] < cur_min:
+        rank += 1
+        cur_min = freq[i][0]
+    ranks.append([rank, freq[i][0], freq[i][1]])
+
+# plot rank against frequency
+x = []
+y = []
+for row in ranks:
+    x.append(int(row[0]))
+    y.append(int(row[1]))
+plt.plot(x, y, 'b*')
+plt.show()
+````
+
+# ElementTree
+
+### How would you get just the Icelandic line and the gloss line?
+
+````
+for tier in root.findall('.//tier'):
+    if tier.attrib['id'] == 'n':
+        for item in tier.findall('.//item'):
+            if item.attrib['tag'] != 'T':  # here is the condition
+                print(item.text)
+````
+
+# scikit learn
+
+### Perceptron answers
+````
+- #хоругвь# incorrect class: 0 correct class: 1
+- #обувь# incorrect class: 0 correct class: 1
+- #морковь# incorrect class: 0 correct class: 1
+- #бровь# incorrect class: 0 correct class: 1
+- #церковь# incorrect class: 0 correct class: 1
+0.982857142857142856
+````
+To improve the quality of our model, we could use an MLP or other models deeper than a single layer.
+
+# Screenscraping
+
+done in __screencap.py__

diff --git a/2018-komp-ling/practicals/segmentation-response.md b/2018-komp-ling/practicals/segmentation-response.md
new file mode 100644
index 00000000..fa2dc210
--- /dev/null
+++ b/2018-komp-ling/practicals/segmentation-response.md
@@ -0,0 +1,19 @@
+
+
+ +

An overview of two sentence-tokenization libraries

+This report tries out two sentence-tokenization libraries on the same text: pragmatic segmenter (Ruby) and NLTK (Python). A chunk of the Russian Wikipedia dump was used as the test text.

Pragmatic segmenter (Ruby)

+Pragmatic segmenter is a rule-based library for Ruby. On the Russian Wikipedia sample its quality was below average: most abbreviations, name initials, and the like were incorrectly split into separate sentences.
+Overall, the library is geared towards languages written in the Latin alphabet.

NLTK (Python)

+sent_tokenize() is the NLTK function for detecting sentence boundaries. Under the hood it is an unsupervised machine-learning algorithm (the Punkt tokenizer), which you can also train yourself. NLTK already ships with a set of pre-trained models, including one for Russian. Overall, this library performed better than the Ruby one: most abbreviations and initials are handled correctly; the only remaining problem is abbreviations with spaces inside them.
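The abbreviation problem both libraries wrestle with is easy to reproduce. Below is a minimal, stdlib-only sketch of a naive rule-based splitter (independent of either library); it shows why "г." trips up punctuation-based rules:

```python
import re

def naive_split(text):
    # Naive rule: a sentence ends at '.', '!' or '?' followed by
    # whitespace and an uppercase (Cyrillic or Latin) letter.
    return re.split(r'(?<=[.!?])\s+(?=[А-ЯA-Z])', text)

# "г." (city) is an abbreviation, not a sentence end, yet the rule
# splits after it because the following "Москве" is capitalized.
for sent in naive_split('Он жил в г. Москве и т. д. Потом уехал.'):
    print(sent)
```

The rule-based pragmatic segmenter and the statistical Punkt model differ precisely in how many such abbreviation cases they can recognize.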
diff --git a/2018-komp-ling/practicals/transliteration-response.md b/2018-komp-ling/practicals/transliteration-response.md new file mode 100644 index 00000000..6532626d --- /dev/null +++ b/2018-komp-ling/practicals/transliteration-response.md @@ -0,0 +1,42 @@ +# Practical 2: Transliteration (engineering) + +
+
+## Questions
+
+What to do with ambiguous letters? For example, Cyrillic `е' could be either je or e.
+
+Can you think of a way that you could provide mappings from many characters to one character?
+For example sh → ш or дж → c?
+
+How might you make different mapping rules for characters at the beginning or end of the string?
+
+### Transliteration rules
+
+The main idea is to start the transliteration with the complex, multi-letter mappings (ч → tch). For example:
+>Шарик -- sh-арик -- sharik
+
+Next, replace all vowels at the beginning and at the end of the word (Я → ya):
+>яблоко -- ya-блоко -- yabloko
+
+After that we can move on to the simple one-letter mappings (у → u):
+>мед -- med
+
+## Methods
+### Encoding and decoding with KOI8-R
+
+Transliteration via the KOI8-R encoding is not the most effective way to transliterate text,
+but it provides some properties that the other approaches lack:
+
+ 1) the original text can be recovered;
+
+ 2) the mapping rules are already fixed by the encoding.
+
+The KOI8-R transliteration method is implemented in transliterate_koi8r.py.
+
+### Encoding and decoding with rules
+
+The transliteration rules live in rules.txt.
+It defines rules for consonants, for vowels, and for vowels at the beginning of a word.
+
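The ordering described above (complex mappings first, then simple single-letter ones) amounts to longest-match-first replacement. Here is a minimal sketch with a toy rule table; the real rules live in rules.txt, so the mappings below (including `дж → j`) are only illustrative:

```python
# Toy rule table: multi-character source rules must win over
# single-character ones, so we try the longest rules first.
RULES = {
    'дж': 'j',
    'щ': 'shch', 'ш': 'sh', 'ч': 'tch', 'ж': 'zh',
    'я': 'ya', 'ю': 'yu', 'ё': 'yo',
    'а': 'a', 'б': 'b', 'в': 'v', 'г': 'g', 'д': 'd', 'е': 'e',
    'и': 'i', 'к': 'k', 'л': 'l', 'м': 'm', 'н': 'n', 'о': 'o',
    'п': 'p', 'р': 'r', 'с': 's', 'т': 't', 'у': 'u',
}

def transliterate(word):
    out, i = [], 0
    keys = sorted(RULES, key=len, reverse=True)  # longest rules first
    while i < len(word):
        for k in keys:
            if word.startswith(k, i):
                out.append(RULES[k])
                i += len(k)
                break
        else:  # no rule matched: copy the character unchanged
            out.append(word[i])
            i += 1
    return ''.join(out)

print(transliterate('шарик'))   # sharik
print(transliterate('яблоко'))  # yabloko
```

Without the longest-match step, 'джем' would come out as 'dzhem' (д + ж applied separately) instead of 'jem'.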
diff --git a/2018-komp-ling/practicals/xrenner-response.md b/2018-komp-ling/practicals/xrenner-response.md
new file mode 100644
index 00000000..4534f8de
--- /dev/null
+++ b/2018-komp-ling/practicals/xrenner-response.md
@@ -0,0 +1,34 @@
+# Xrenner Response
+
+First, we must do all the preparation, such as installing xrenner.
+The standard model for English works just fine:
+
+    $ python3 xrenner.py -m eng -o html example_in.conll10 > /mnt/c/sub_wsl/example.html
+
+Next, we must make our own language model (in this case, Russian). We can build this model from scratch, or copy the meta-language folder:
+
+    $ cp -R ./models/udx ./models/rus
+
+### Rules
+
+Now we add some rules to our model.
+
+__pronouns.tab:__
+
+    я	1sg
+    мы	1pl
+    он	male
+    она	fema
+    его	male
+    её	fema
+    меня	1sg
+    нас	1pl
+
+__coref.tab:__
+
+    Рабиндранат Тагор|Тагор	coref
+
+These simple rules give us great results (see pushkin.html):
+
+    $ python3 xrenner.py -m rus -o html pushkin.conllu > /mnt/c/sub_wsl/pushkin.html
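For intuition about what entries like `он male` buy us: an agreement class acts as a hard filter on which antecedents a pronoun may link to. The sketch below is a hypothetical illustration of that idea only, not xrenner's actual matching code; the `AGREE` table simply mirrors the pronouns.tab entries above:

```python
# Hypothetical sketch: agreement classes as a coreference filter.
# The class names mirror the pronouns.tab entries; this is NOT
# xrenner's real resolution algorithm.
AGREE = {'он': 'male', 'его': 'male', 'она': 'fema', 'её': 'fema',
         'я': '1sg', 'меня': '1sg', 'мы': '1pl', 'нас': '1pl'}

def antecedent_candidates(pronoun, mentions):
    """mentions: (text, agreement_class) pairs seen earlier in the text."""
    cls = AGREE.get(pronoun.lower())
    # only mentions whose agreement class matches the pronoun survive
    return [text for text, c in mentions if c == cls]

mentions = [('Пушкин', 'male'), ('Тагор', 'male'), ('жена', 'fema')]
print(antecedent_candidates('его', mentions))  # ['Пушкин', 'Тагор']
print(antecedent_candidates('её', mentions))   # ['жена']
```

A real resolver then ranks the surviving candidates (e.g. by distance or syntax); the agreement table only prunes impossible links, which is why a handful of pronouns.tab lines already improves pushkin.html noticeably.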