n.2 - Sed challenge: join cal -y months into a single column
date: 2021-11-16
This is a little sed challenge that occurred to me somewhat recently, when I wondered how many weeks there were between two dates. Of course, I just looked at the calendar and counted, but then I wondered what kind of sed script I’d need in the middle of a cal -y | ... | wc -l pipeline to get the same answer.
Seems like a good idea for an initial post to get a better feel for working with org-mode source blocks. Org-mode has the ability to print the result of a block in the file, then use that as input when evaluating another source block. That seems useful for posts.
Well, let’s start.
cal -y
Result:
2021
January February March
Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
1 2 1 2 3 4 5 6 1 2 3 4 5 6
3 4 5 6 7 8 9 7 8 9 10 11 12 13 7 8 9 10 11 12 13
10 11 12 13 14 15 16 14 15 16 17 18 19 20 14 15 16 17 18 19 20
17 18 19 20 21 22 23 21 22 23 24 25 26 27 21 22 23 24 25 26 27
24 25 26 27 28 29 30 28 28 29 30 31
31
April May June
Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
1 2 3 1 1 2 3 4 5
4 5 6 7 8 9 10 2 3 4 5 6 7 8 6 7 8 9 10 11 12
11 12 13 14 15 16 17 9 10 11 12 13 14 15 13 14 15 16 17 18 19
18 19 20 21 22 23 24 16 17 18 19 20 21 22 20 21 22 23 24 25 26
25 26 27 28 29 30 23 24 25 26 27 28 29 27 28 29 30
30 31
July August September
Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
1 2 3 1 2 3 4 5 6 7 1 2 3 4
4 5 6 7 8 9 10 8 9 10 11 12 13 14 5 6 7 8 9 10 11
11 12 13 14 15 16 17 15 16 17 18 19 20 21 12 13 14 15 16 17 18
18 19 20 21 22 23 24 22 23 24 25 26 27 28 19 20 21 22 23 24 25
25 26 27 28 29 30 31 29 30 31 26 27 28 29 30
October November December
Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
1 2 1 2 3 4 5 6 1 2 3 4
3 4 5 6 7 8 9 7 8 9 10 11 12 13 5 6 7 8 9 10 11
10 11 12 13 14 15 16 14 15 16 17 18 19 20 12 13 14 15 16 17 18
17 18 19 20 21 22 23 21 22 23 24 25 26 27 19 20 21 22 23 24 25
24 25 26 27 28 29 30 28 29 30 26 27 28 29 30 31
31
First, let’s split the three side-by-side month columns so that their lines interlace.
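As a smaller illustration of the idea, here is a toy sketch on a single made-up row. The 4-character columns and the two-space gap are assumptions picked for readability, not cal’s real widths, and the \n in the replacement relies on GNU sed.

```shell
# Toy row: three 4-character columns separated by two spaces (made-up widths).
row='aaaa  bbbb  cccc'

# Capture each 4-character chunk that is followed by the separator and
# replace the separator with a newline (GNU sed extension in the replacement).
split=$(printf '%s\n' "$row" | sed -E 's/(.{4})  /\1\n/g')

printf '%s\n' "$split"
```

The last column has no separator after it, so it is left alone and simply ends up on its own line.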
sed -E '1,2d;s/(.{20}) /\1\n/g;'
Result:
January
February
March
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
1 2
1 2 3 4 5 6
1 2 3 4 5 6
3 4 5 6 7 8 9
7 8 9 10 11 12 13
7 8 9 10 11 12 13
10 11 12 13 14 15 16
14 15 16 17 18 19 20
14 15 16 17 18 19 20
17 18 19 20 21 22 23
21 22 23 24 25 26 27
21 22 23 24 25 26 27
24 25 26 27 28 29 30
28
28 29 30 31
31
April
May
June
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
1 2 3
1
1 2 3 4 5
4 5 6 7 8 9 10
2 3 4 5 6 7 8
6 7 8 9 10 11 12
11 12 13 14 15 16 17
9 10 11 12 13 14 15
13 14 15 16 17 18 19
18 19 20 21 22 23 24
16 17 18 19 20 21 22
20 21 22 23 24 25 26
25 26 27 28 29 30
23 24 25 26 27 28 29
27 28 29 30
30 31
July
August
September
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
1 2 3
1 2 3 4 5 6 7
1 2 3 4
4 5 6 7 8 9 10
8 9 10 11 12 13 14
5 6 7 8 9 10 11
11 12 13 14 15 16 17
15 16 17 18 19 20 21
12 13 14 15 16 17 18
18 19 20 21 22 23 24
22 23 24 25 26 27 28
19 20 21 22 23 24 25
25 26 27 28 29 30 31
29 30 31
26 27 28 29 30
October
November
December
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
1 2
1 2 3 4 5 6
1 2 3 4
3 4 5 6 7 8 9
7 8 9 10 11 12 13
5 6 7 8 9 10 11
10 11 12 13 14 15 16
14 15 16 17 18 19 20
12 13 14 15 16 17 18
17 18 19 20 21 22 23
21 22 23 24 25 26 27
19 20 21 22 23 24 25
24 25 26 27 28 29 30
28 29 30
26 27 28 29 30 31
31
That allows each column to be selected with first~step addresses (a GNU sed extension). I’ll use the hold space to hold the lines for the middle and right columns until it’s time to print them.
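Before the full script, a minimal sketch of those two ingredients on numbered lines from seq. It assumes GNU sed, and the two-column layout in the second half is a simplification of the three-column case, not the script used below.

```shell
# first~step matches line "first" and then every "step"-th line after it,
# which picks one column out of an interlaced stream.
middle=$(seq 1 9 | sed -n '2~3p' | paste -sd ' ' -)
printf 'middle column: %s\n' "$middle"

# Two interlaced columns: odd lines are the left one, even lines the right.
# The right column waits in the hold space until the left one is printed.
deinterlaced=$(seq 1 6 | sed -n '
    # Left column: print right away.
    1~2p
    # Right column: append to the hold space (H puts a newline first).
    2~2H
    # Last line: fetch the held lines, drop the leading newline, print.
    ${g;s/^\n//;p}
' | paste -sd ' ' -)
printf 'de-interlaced: %s\n' "$deinterlaced"
```

With three columns the held lines of the middle and right months arrive interleaved, which is why the real script also has to reorder the hold space as it goes.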
sed -E '
# Middle column months
2~3 {
# Month-trios are 24 lines = 3 month columns * (monthline + weeklabel + max
# 6-week coverage of a month)
2~24h
2~24! {
H
# Move line to where it belongs at the middle, using the right-month
# name as an anchor to find the end of the middle month.
#
# When there are fewer than 3 lines in the hold space, nothing needs
# to be done. It is only from the third line onwards that they must be
# moved. This means that we can count on there being at least 2 newlines.
x
s/\n( +[A-Z][^\n]+.*)\n([^\n]*)$/\n\2\n\1/
x
}
}
# Right column months
3~3 {H}
$b get_held_months # no next trio, prepare to print held months
# Only print on lines of left month (and at EOF via jump on previous line).
1~3!d
# New month-trio
/^ +[A-Z]/ {
1b # Skip for first line
# Append, so the months held end up effectively prepended when they are
# retrieved.
H
: get_held_months
# Get the sorted middle and right month lines to output them.
g
}
'
Result:
January
Su Mo Tu We Th Fr Sa
1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31
February
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28
March
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
April
Su Mo Tu We Th Fr Sa
1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
May
Su Mo Tu We Th Fr Sa
1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31
June
Su Mo Tu We Th Fr Sa
1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30
July
Su Mo Tu We Th Fr Sa
1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
August
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31
September
Su Mo Tu We Th Fr Sa
1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30
October
Su Mo Tu We Th Fr Sa
1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31
November
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30
December
Su Mo Tu We Th Fr Sa
1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31
Nice. Now, to be able to count weeks by piping into wc -l, the start and end of each month will have to be joined.
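One ingredient of the join is remembering which month a week belongs to. A toy sketch of that tagging, assuming GNU sed (for \b) and using made-up headers Alpha and Beta in place of month names:

```shell
tagged=$(printf '%s\n' 'Alpha' '1 2' '10 11' 'Beta' '1' '12 13' | sed -E '
    # A line starting with a capital letter is a header: hold it, print nothing.
    /^[A-Z]/{h;d}
    # On the line containing a standalone 1, append the held header and
    # turn the joining newline into a space.
    /\b1\b/{G;s/\n/ /}
')

printf '%s\n' "$tagged"
```

The word boundaries keep 10, 11 and 12 from counting as a 1, so only the first line after each header gets tagged.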
sed -E '
# Save the month name
/^ +[A-Z]/{s/ //g;h}
# Delete everything but the date lines
/^ *[1-9]/!d
# If on the week with the first of the month, add the month name to the right.
/\b1\b/{G;s/\n/ /}
'
Result:
1 2 January
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31
1 2 3 4 5 6 February
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28
1 2 3 4 5 6 March
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
1 2 3 April
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
1 May
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31
1 2 3 4 5 June
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30
1 2 3 July
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
1 2 3 4 5 6 7 August
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31
1 2 3 4 September
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30
1 2 October
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31
1 2 3 4 5 6 November
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30
1 2 3 4 December
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31
Almost there.
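What is left hinges on the N command, which appends the next input line to the pattern space. A toy sketch with made-up three-number "weeks" instead of seven-day ones:

```shell
joined=$(printf '%s\n' '1 2 3' '4 5' '6' '7 8 9' | sed -E '
    # A line that does not hold three numbers is incomplete...
    /[0-9]+( [0-9]+){2}/!{
        # ...so pull in the next line and join the two with a space.
        N
        s/\n/ /
    }
')

printf '%s\n' "$joined"
```

The short lines 4 5 and 6 come out as a single 4 5 6 line, while the complete lines pass through untouched.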
sed -E '
# Not 7 days this week?
/[0-9]+(\s+[0-9]+){6}/!{
# and is not the first week of the first month?
1b
# We must be at the end of the month. Bring the start of the next month.
N
# Join the start of the next to this end.
s/ +\n +/ /
}
'
Result:
1 2 January
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31 1 2 3 4 5 6 February
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 1 2 3 4 5 6 March
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31 1 2 3 April
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 1 May
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31 1 2 3 4 5 June
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 1 2 3 July
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
1 2 3 4 5 6 7 August
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31 1 2 3 4 September
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 1 2 October
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31 1 2 3 4 5 6 November
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 1 2 3 4 December
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31
There we are. Not a perfect format, but good enough depending on the dates we’re curious about.
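One way to cut out a span of weeks is an awk range pattern (start, end) combined with a variable that remembers the last tag seen. A toy sketch of that combination, with made-up tags A, B and C standing in for month names:

```shell
picked=$(printf '%s\n' '1 A' '2' '3' '1 B' '2' '3' '1 C' '2' | awk '
    # Remember the most recent tag; untagged lines inherit it.
    $2 { tag = $2 }
    # Print from the line with a 2 under tag B through the line with a 1
    # under tag C, both ends included.
    tag == "B" && /2/, tag == "C" && /1/
')

printf '%s\n' "$picked"
```

Because the first rule runs before the range is evaluated on each line, the tag is already up to date when the start and end conditions are checked.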
awk '
$8 { m = $8 }
m == "July" && /\<7\>/, m == "August" && /20/
'
Result:
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
1 2 3 4 5 6 7 August
8 9 10 11 12 13 14
15 16 17 18 19 20 21
And at long last:
wc -l
Result:
6
Crap. :(
That 6 should be a 7. That seems to be a bug in org-mode. It strips the terminating newline of the previous block’s output for use as a variable in the next block, like how a shell’s "$()" expansion would, but it fails to add it back when passing it to the stdin of a block, like how a shell’s <<< redirection would. Since the last piece of text doesn’t properly terminate with a newline, wc -l doesn’t count it as a line.
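The effect is easy to reproduce in a plain shell, outside org-mode. This is only a sketch of the mechanism described above, not of what org-mode does internally; tr strips the padding some wc implementations add.

```shell
# Command substitution strips trailing newlines from the captured output.
out=$(printf 'a\nb\nc\n')

# Written back without a newline, the last line is unterminated, and wc -l,
# which counts newline characters, reports one line fewer.
unterminated=$(printf '%s' "$out" | wc -l | tr -d ' ')

# Written back with the newline restored (as a bash here-string would do),
# the count matches the number of lines.
terminated=$(printf '%s\n' "$out" | wc -l | tr -d ' ')

printf 'without newline: %s, with newline: %s\n' "$unterminated" "$terminated"
```

Three lines go in, and the count is 2 without the final newline and 3 with it, the same off-by-one as the 6 above.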
Oh well. At least I learned a bit about working with code from org-mode with this.
EDIT: For the sake of completeness, here’s the sed equivalent of the awk bit at the end:
sed -E '
# Hold the month name
/[a-z]$/{h;s/.* //;x}
# Append it when month is missing
/[a-z]$/!G
# Delete all lines not between July 7th and August 20th
/\b7\b.*July/,/20.*August/!d
# Delete appended month
s/\n.*//
'
Result:
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
1 2 3 4 5 6 7 August
8 9 10 11 12 13 14
15 16 17 18 19 20 21