Week 01 Laboratory (with solutions)

Objectives

  • Understanding regular expressions
  • Understanding use of UNIX filters (grep)

Preparation

Before the lab you should re-read the relevant lecture slides and their accompanying examples.

Getting Started

If you're not working at CSE, you can download the provided files as a zip file or a tar file.

Exercise 1: grep-ing a Dictionary

Most Unix systems provide one or more dictionary files containing many thousands of words, typically in the directory /usr/share/dict/:

sh
ls -1 /usr/share/dict/
american-english
american-english-huge
american-english-insane
american-english-large
american-english-small
british-english
british-english-huge
british-english-insane
british-english-large
british-english-small
cracklib-small
words -> /etc/dictionaries-common/words -> /usr/share/dict/american-english

We've created an example dictionary named dictionary.txt for this lab exercise.

Exercise 1.1

Write a grep -E command that prints the words which contain the characters "lmn" consecutively.

It should print:

almner
almners
calmness
calmnesses
Sample answer:
sh
grep -E 'lmn' dictionary.txt

Exercise 1.2

Write a grep -E command that prints the words which contain any four consecutive vowels.

It should print:

aqueous
archaeoastronomer
archaeoastronomers
archaeoastronomical
archaeoastronomies
archaeoastronomy
banlieue
beauish
blooie
cooee
cooeed
cooeeing
cooees
enqueue
epigaeous
epopoeia
epopoeias
euoi
euois
euouae
euouaes
flooie
giaour
giaours
gooier
gooiest
guaiac
guaiacol
guaiacols
guaiacs
guaiacum
guaiacums
guaiocum
guaiocums
hawaiians
homoiousian
homoiousians
hypoaeolian
hypogaeous
loaiasis
looie
looies
louie
louies
maieutic
maieutical
maieutics
meoued
meouing
metasequoia
metasequoias
miaou
miaoued
miaouing
miaous
mythopoeia
mythopoeias
nonaqueous
obloquious
obsequious
obsequiously
obsequiousness
obsequiousnesses
onomatopoeia
onomatopoeias
palaeoanthropic
palaeoanthropological
palaeoanthropologies
palaeoanthropologist
palaeoanthropologists
palaeoanthropology
palaeoecologic
palaeoecological
palaeoecologies
palaeoecologist
palaeoecologists
palaeoecology
palaeoethnologic
palaeoethnological
palaeoethnologist
palaeoethnologists
palaeoethnology
pharmacopoeia
pharmacopoeial
pharmacopoeian
pharmacopoeias
plateaued
plateauing
prosopopoeia
prosopopoeial
prosopopoeias
queue
queued
queueing
queueings
queuer
queuers
queues
queuings
radioautograph
radioautographic
radioautographies
radioautographs
radioautography
radioiodine
radioiodines
reliquiae
rhythmopoeia
saouari
saouaris
scarabaeoid
scarabaeoids
sequoia
sequoias
subaqueous
tenuious
terraqueous
zoaea
zooea
zooeae
zooeal
zooeas
zoogloeae
zoogloeoid
zooier
zooiest
Sample answer:
sh
grep -E -i '[aeiou]{4}' dictionary.txt
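
As a quick check of the {4} quantifier, the sketch below runs the same pattern over four hand-picked words rather than dictionary.txt. The words queue and aqueous appear in the expected output above; beautiful and rhythm are assumed extra inputs, chosen because they contain at most three consecutive vowels.

```sh
# Hand-picked sample words (an assumption for illustration, not dictionary.txt).
four_vowels=$(printf '%s\n' queue aqueous beautiful rhythm | grep -E -i '[aeiou]{4}')

# Only the words containing a run of four vowels are kept.
echo "$four_vowels"
```

Feeding grep a few known words with printf is a cheap way to test a pattern before running it on the full dictionary.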

Exercise 1.3

Write a grep -E command that prints the words which contain all 5 vowels "aeiou" in that order.

The words may contain more than 5 vowels but they must contain "aeiou" in that order.

It should print:

abstemious
abstemiously
abstemiousness
abstemiousnesses
abstentious
adenocarcinomatous
adventitious
adventitiously
adventitiousness
adventitiousnesses
aeruginous
amentiferous
androdioecious
andromonoecious
anemophilous
antenniferous
antireligious
arenicolous
argentiferous
arsenious
arteriovenous
asclepiadaceous
autoecious
autoeciously
bacteriophagous
caesalpiniaceous
caesious
cavernicolous
chaetiferous
facetious
facetiously
facetiousness
facetiousnesses
flagelliferous
garnetiferous
haemoglobinous
hamamelidaceous
lateritious
paroecious
quadrigeminous
sacrilegious
sacrilegiously
sacrilegiousness
sacrilegiousnesses
sarraceniaceous
supercalifragilisticexpialidocious
ultrareligious
ultraserious
valerianaceous
Sample answer:
sh
grep -E -i 'a.*e.*i.*o.*u' dictionary.txt

Exercise 1.4

Write a grep -E command that prints the words which contain the vowels "aeiou", in that order, and no other vowels.

It should print:

abstemious
abstemiously
abstentious
arsenious
caesious
facetious
facetiously
Sample answer:
sh
grep -E -i '^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$' dictionary.txt
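
The difference between Exercises 1.3 and 1.4 can be seen on a small sample. The sketch below uses three words taken from the expected outputs above and compares the loose pattern (any characters between the vowels) with the anchored pattern (only non-vowels between them). The variable names are illustrative.

```sh
# Three words from the expected outputs above.
words='facetious
facetiousness
sacrilegious'

# Exercise 1.3 pattern: .* allows extra vowels anywhere, so all three words match.
loose=$(printf '%s\n' "$words" | grep -E -i -c 'a.*e.*i.*o.*u')

# Exercise 1.4 pattern: ^ and $ anchor the whole word and [^aeiou]* forbids
# any other vowel, so only facetious survives.
strict=$(printf '%s\n' "$words" | grep -E -i '^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$')

echo "loose pattern matched $loose words"
echo "strict pattern matched: $strict"
```

Without the ^ and $ anchors the strict pattern could match a substring of a longer word, which is why both are needed.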

Exercise 2: grep-ing Federal Parliament

You have been given a file named parliament_answers.txt, which you must use to enter your answers for this exercise.

The autotest scripts depend on the format of parliament_answers.txt, so add your answers where indicated and do not otherwise change the file.

sh
    # Open a text editor (gedit) in the background (&) and detach it from the current shell's job control (disown)
    gedit parliament_answers.txt & disown
    # Or use any other text editor of your choosing

In this exercise you will analyze a file named parliament.txt containing a list of the members of the Australian House of Representatives (MPs).

As we have just had an election, the information in parliament.txt might not be up to date.

Exercise 2.1

Write a grep -E command that will print all the lines in the file where the electorate begins with 'W'.

It should print:

Hon Scott Buchholz: Member for Wright, Queensland
Hon Tony Burke: Member for Watson, New South Wales
Hon Stephen Jones: Member for Whitlam, New South Wales
Mr Peter Khalil: Member for Wills, Victoria
Mr Llew O'Brien: Member for Wide Bay, Queensland
Ms Allegra Spender: Member for Wentworth, New South Wales
Ms Anne Stanley: Member for Werriwa, New South Wales
Ms Zali Steggall OAM: Member for Warringah, New South Wales
Hon Dan Tehan: Member for Wannon, Victoria
Sample answer:
sh
grep -E 'Member for W' parliament.txt

Exercise 2.2

Write a grep -E command that will print all the lines in the file where the MP's given name (first name) is "Andrew".

It should print:

Dr Andrew Charlton: Member for Parramatta, New South Wales
Hon Andrew Gee: Member for Calare, New South Wales
Hon Andrew Giles: Member for Scullin, Victoria
Hon Andrew Hastie: Member for Canning, Western Australia
Hon Dr Andrew Leigh: Member for Fenner, Australian Capital Territory
Mr Andrew Wallace: Member for Fisher, Queensland
Mr Andrew Wilkie: Member for Clark, Tasmania
Mr Andrew Willcox: Member for Dawson, Queensland
Sample answer:
sh
grep -E '^((Mr|Mrs|Ms|Dr|Hon) )*Andrew .*:' parliament.txt

Note that this more obvious answer would also match an MP whose middle name is Andrew:

sh
grep -E ' Andrew .*:' parliament.txt
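
To see why the anchored pattern is safer, the sketch below compares both answers on two lines. The first line is taken from the expected output above; the second, "Mr John Andrew Citizen", is a hypothetical MP invented for this example and does not appear in parliament.txt.

```sh
# Line 1 is from the expected output; line 2 is hypothetical (invented for illustration).
lines='Hon Dr Andrew Leigh: Member for Fenner, Australian Capital Territory
Mr John Andrew Citizen: Member for Nowhere, Victoria'

# Anchored: only titles may precede Andrew, so the middle name is rejected.
anchored=$(printf '%s\n' "$lines" | grep -E -c '^((Mr|Mrs|Ms|Dr|Hon) )*Andrew .*:')

# Unanchored: any word may precede Andrew, so both lines match.
unanchored=$(printf '%s\n' "$lines" | grep -E -c ' Andrew .*:')

echo "anchored: $anchored, unanchored: $unanchored"
```

The repeated group ((Mr|Mrs|Ms|Dr|Hon) )* is what lets the anchored pattern accept stacked titles such as "Hon Dr".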

Exercise 2.3

Write a grep -E command that will print all the lines in the file where the MP's surname (last name) ends in the letters 'll'.

It should print:

Ms Angie Bell: Member for Moncrieff, Queensland
Mr Sam Birrell: Member for Nicholls, Victoria
Mr Matt Burnell: Member for Spence, South Australia
Mr Julian Hill: Member for Bruce, Victoria
Mr Brian Mitchell: Member for Lyons, Tasmania
Mr Rob Mitchell: Member for McEwen, Victoria
Ms Zali Steggall OAM: Member for Warringah, New South Wales
Sample answer:
sh
grep -E 'll( [A-Z]*)?:' parliament.txt

Note that this more obvious answer does not handle an MP whose surname is followed by an honour such as OAM:

sh
grep -E 'll:' parliament.txt
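
The effect of the optional group can be checked on three lines taken from the expected outputs earlier in this lab. This is only a sketch; the variable names are illustrative.

```sh
# Three lines from the expected outputs above.
lines='Ms Zali Steggall OAM: Member for Warringah, New South Wales
Mr Julian Hill: Member for Bruce, Victoria
Hon Tony Burke: Member for Watson, New South Wales'

# Plain pattern: requires the colon directly after "ll", so Steggall OAM is missed.
plain=$(printf '%s\n' "$lines" | grep -E -c 'll:')

# Optional group: ( [A-Z]*)? allows one upper-case word between the surname and the colon.
optional=$(printf '%s\n' "$lines" | grep -E -c 'll( [A-Z]*)?:')

echo "plain: $plain, with optional group: $optional"
```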

Exercise 2.4

Write a grep -E command that will print all the lines in the file where the MP's surname (last name) and the electorate name both end in the letter 'y'.

It should print:

Ms Peta Murphy: Member for Dunkley, Victoria
Mr Rowan Ramsey: Member for Grey, South Australia
Sample answer:
sh
grep -E 'y( [A-Z]*)?:.*y,' parliament.txt

Note that this more obvious answer does not handle an MP whose surname is followed by an honour such as OAM:

sh
grep -E 'y:.*y,' parliament.txt

Exercise 2.5

Write a grep -E command that will print all the lines in the file where the MP's surname (last name) or the electorate name ends in the letter 'y'.

It should print:

Hon Dr Anne Aly: Member for Cowan, Western Australia
Hon Linda Burney: Member for Barton, New South Wales
Ms Kate Chaney: Member for Curtin, Western Australia
Hon Pat Conroy: Member for Shortland, New South Wales
Hon Milton Dick: Member for Oxley, Queensland
Hon Ed Husic: Member for Chifley, New South Wales
Hon Bob Katter: Member for Kennedy, Queensland
Hon Ged Kearney: Member for Cooper, Victoria
Hon Michelle Landry: Member for Capricornia, Queensland
Hon Sussan Ley: Member for Farrer, New South Wales
Mr Sam Lim: Member for Tangney, Western Australia
Mrs Melissa McIntosh: Member for Lindsay, New South Wales
Ms Louise Miller-Frost: Member for Boothby, South Australia
Ms Peta Murphy: Member for Dunkley, Victoria
Mr Llew O'Brien: Member for Wide Bay, Queensland
Hon Tanya Plibersek: Member for Sydney, New South Wales
Mr Rowan Ramsey: Member for Grey, South Australia
Hon Michelle Rowland: Member for Greenway, New South Wales
Ms Anne Stanley: Member for Werriwa, New South Wales
Ms Kylea Tink: Member for North Sydney, New South Wales
Mr Aaron Violi: Member for Casey, Victoria
Hon Anika Wells: Member for Lilley, Queensland
Sample answer:
sh
grep -E 'y( [A-Z]*)?:|y,' parliament.txt

Note that this more obvious answer does not handle an MP whose surname is followed by an honour such as OAM:

sh
grep -E 'y[:,]' parliament.txt

Exercise 2.6

Write a grep -E command that will print all the lines in the file where there is any word in the MP's name or the electorate name that ends in "ng".

It should print:

Mr Luke Gosling OAM: Member for Solomon, Northern Territory
Hon Andrew Hastie: Member for Canning, Western Australia
Hon Catherine King: Member for Ballarat, Victoria
Hon Madeleine King: Member for Brand, Western Australia
Mr Jerome Laxale: Member for Bennelong, New South Wales
Dr Monique Ryan: Member for Kooyong, Victoria
Hon Bill Shorten: Member for Maribyrnong, Victoria
Mr Terry Young: Member for Longman, Queensland
Sample answer:
sh
grep -E 'ng[^a-z]' parliament.txt
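
The [^a-z] after "ng" is what restricts the match to the end of a word. The sketch below tries the pattern on four lines taken from the expected outputs in this lab: in three of them "ng" is followed by a space, a colon or a comma, while in "England" it is followed by a lower-case letter.

```sh
# Four lines from the expected outputs in this lab.
lines='Mr Luke Gosling OAM: Member for Solomon, Northern Territory
Hon Catherine King: Member for Ballarat, Victoria
Hon Andrew Hastie: Member for Canning, Western Australia
Hon Barnaby Joyce: Member for New England, New South Wales'

# "ng" must be followed by something other than a lower-case letter,
# so the "ng" inside "England" does not count.
ng_lines=$(printf '%s\n' "$lines" | grep -E 'ng[^a-z]')

echo "$ng_lines"
```

Note that [^a-z] must match an actual character, so this pattern would miss a word ending in "ng" at the very end of a line; that does not arise here because every line ends with a state or territory name.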

Exercise 2.7

Write a grep -E command that will print all the lines in the file where the MP's surname (last name) both begins and ends with a vowel.

It should print:

Hon Anthony Albanese: Member for Grayndler, New South Wales
Sample answer:
sh
grep -E '[AEIOU][^ ]*[aeiou]( [A-Z]*)?:' parliament.txt

Exercise 2.8

Write a grep -E command that will print all the lines in the file where the electorate name contains multiple words (separated by spaces or hyphens).

It should print:

Hon Barnaby Joyce: Member for New England, New South Wales
Hon Kristy McBain: Member for Eden-Monaro, New South Wales
Mr Llew O'Brien: Member for Wide Bay, Queensland
Hon Matt Thistlethwaite: Member for Kingsford Smith, New South Wales
Ms Kylea Tink: Member for North Sydney, New South Wales
Hon Jason Wood: Member for La Trobe, Victoria
Sample answer:
sh
grep -E 'Member for [a-zA-Z]+[ -][a-zA-Z]' parliament.txt
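
A short check of this pattern, using three lines from the expected outputs in this lab. It is a sketch only, intended to show that the multi-word state name after the comma does not cause a false match.

```sh
# Three lines from the expected outputs in this lab.
lines='Hon Kristy McBain: Member for Eden-Monaro, New South Wales
Hon Jason Wood: Member for La Trobe, Victoria
Hon Tony Burke: Member for Watson, New South Wales'

# After "Member for", a run of letters must be followed by a space or hyphen
# and then another letter. "Watson" is followed by a comma, so it is rejected
# even though "New South Wales" later on the line has several words.
multi=$(printf '%s\n' "$lines" | grep -E 'Member for [a-zA-Z]+[ -][a-zA-Z]')

echo "$multi"
```

Inside the bracket expression [ -] the hyphen is placed last so that it is treated as a literal character rather than as a range.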