n.2 - Sed challenge: join cal -y months into a single column

date: 2021-11-16

This is a little sed challenge that occurred to me somewhat recently, when I wondered how many weeks there were between two dates. Of course, I just looked at the calendar and counted, but then I wondered what kind of sed script I’d need in the middle of a cal -y | ... | wc -l pipeline to get the same answer.

Seems like a good idea for an initial post to get a better feel for working with org-mode source blocks. Org-mode has the ability to print the result of a block in the file, then use that as input when evaluating another source block. That seems useful for posts.

Well, let’s start.

cal -y

Result:

                               2021                               

       January               February                 March       
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
                1  2       1  2  3  4  5  6       1  2  3  4  5  6
 3  4  5  6  7  8  9    7  8  9 10 11 12 13    7  8  9 10 11 12 13
10 11 12 13 14 15 16   14 15 16 17 18 19 20   14 15 16 17 18 19 20
17 18 19 20 21 22 23   21 22 23 24 25 26 27   21 22 23 24 25 26 27
24 25 26 27 28 29 30   28                     28 29 30 31         
31                                                                
        April                   May                   June        
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
             1  2  3                      1          1  2  3  4  5
 4  5  6  7  8  9 10    2  3  4  5  6  7  8    6  7  8  9 10 11 12
11 12 13 14 15 16 17    9 10 11 12 13 14 15   13 14 15 16 17 18 19
18 19 20 21 22 23 24   16 17 18 19 20 21 22   20 21 22 23 24 25 26
25 26 27 28 29 30      23 24 25 26 27 28 29   27 28 29 30         
                       30 31                                      
        July                  August                September     
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
             1  2  3    1  2  3  4  5  6  7             1  2  3  4
 4  5  6  7  8  9 10    8  9 10 11 12 13 14    5  6  7  8  9 10 11
11 12 13 14 15 16 17   15 16 17 18 19 20 21   12 13 14 15 16 17 18
18 19 20 21 22 23 24   22 23 24 25 26 27 28   19 20 21 22 23 24 25
25 26 27 28 29 30 31   29 30 31               26 27 28 29 30      
                                                                  
       October               November               December      
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
                1  2       1  2  3  4  5  6             1  2  3  4
 3  4  5  6  7  8  9    7  8  9 10 11 12 13    5  6  7  8  9 10 11
10 11 12 13 14 15 16   14 15 16 17 18 19 20   12 13 14 15 16 17 18
17 18 19 20 21 22 23   21 22 23 24 25 26 27   19 20 21 22 23 24 25
24 25 26 27 28 29 30   28 29 30               26 27 28 29 30 31   
31                                                                

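First, let’s drop the year header and interleave the month columns: every 20-character month row that is followed by the 3-space gutter gets its own line, so each row of a month trio becomes three consecutive lines (left, middle, right).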

sed -E '1,2d;s/(.{20})   /\1\n/g;'

Result:

       January      
      February      
        March       
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
                1  2
    1  2  3  4  5  6
    1  2  3  4  5  6
 3  4  5  6  7  8  9
 7  8  9 10 11 12 13
 7  8  9 10 11 12 13
10 11 12 13 14 15 16
14 15 16 17 18 19 20
14 15 16 17 18 19 20
17 18 19 20 21 22 23
21 22 23 24 25 26 27
21 22 23 24 25 26 27
24 25 26 27 28 29 30
28                  
28 29 30 31         
31                  
                    
                    
        April       
         May        
        June        
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
             1  2  3
                   1
       1  2  3  4  5
 4  5  6  7  8  9 10
 2  3  4  5  6  7  8
 6  7  8  9 10 11 12
11 12 13 14 15 16 17
 9 10 11 12 13 14 15
13 14 15 16 17 18 19
18 19 20 21 22 23 24
16 17 18 19 20 21 22
20 21 22 23 24 25 26
25 26 27 28 29 30   
23 24 25 26 27 28 29
27 28 29 30         
                    
30 31               
                    
        July        
       August       
      September     
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
             1  2  3
 1  2  3  4  5  6  7
          1  2  3  4
 4  5  6  7  8  9 10
 8  9 10 11 12 13 14
 5  6  7  8  9 10 11
11 12 13 14 15 16 17
15 16 17 18 19 20 21
12 13 14 15 16 17 18
18 19 20 21 22 23 24
22 23 24 25 26 27 28
19 20 21 22 23 24 25
25 26 27 28 29 30 31
29 30 31            
26 27 28 29 30      
                    
                    
                    
       October      
      November      
      December      
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
Su Mo Tu We Th Fr Sa
                1  2
    1  2  3  4  5  6
          1  2  3  4
 3  4  5  6  7  8  9
 7  8  9 10 11 12 13
 5  6  7  8  9 10 11
10 11 12 13 14 15 16
14 15 16 17 18 19 20
12 13 14 15 16 17 18
17 18 19 20 21 22 23
21 22 23 24 25 26 27
19 20 21 22 23 24 25
24 25 26 27 28 29 30
28 29 30            
26 27 28 29 30 31   
31                  
                    
                    

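That allows each column to be selected with GNU sed’s first~step addresses: 1~3 matches the lines of the left month, 2~3 the middle one and 3~3 the right one. I’ll use the hold space to hold the lines for the middle and right columns until it’s time to print them.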
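Before the full script, here is a scaled-down sketch of the two ingredients: the step addresses and the hold space. The input is made up (2-character cells instead of 20, still separated by three spaces), and GNU sed is assumed, since both the first~step addresses and the \n in the replacement are GNU extensions.

```shell
# Made-up stand-in for two calendar rows: 2-character cells, 3-space gutters.
rows='A1   B1   C1
A2   B2   C2'

# Same idea as the 20-character substitution: end each cell with a newline.
interleaved=$(printf '%s\n' "$rows" | sed -E 's/(.{2})   /\1\n/g')

# Every third line starting at line 1 belongs to the left column.
left=$(printf '%s\n' "$interleaved" | sed -n '1~3p')

# Collect the middle column in the hold space and print it at the last line.
# H appends "newline + line", so the leading newline is removed before printing.
middle=$(printf '%s\n' "$interleaved" | sed -n '2~3H; ${x; s/^\n//; p}')

printf '%s\n' "$left" "$middle"   # prints A1, A2, B1, B2 on separate lines
```

The real script below works on 20-character cells and also has to keep the middle and right months in order inside the hold space, which is where most of its complexity comes from.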

sed -E '
  # Middle column months
  2~3 {
    # Month-trios are 24 lines = 3 month columns * (monthline + weeklabel + max
    # 6-week coverage of a month)
    2~24h
    2~24! {
      H

      # Move the line to where it belongs in the middle month, using the
      # right-month name as an anchor to find the end of the middle month.
      #
      # While there are fewer than 3 lines in the hold space, nothing needs
      # to be done. It is only from the third line onwards that lines must be
      # moved. This means that we can count on there being at least 2 newlines.
      x
      s/\n( +[A-Z][^\n]+.*)\n([^\n]*)$/\n\2\n\1/
      x
    }
  }

  # Right column months
  3~3 {H}

  $b get_held_months # no next trio, prepare to print held months

  # Only print on lines of left month (and at EOF via jump on previous line).
  1~3!d

  # New month-trio
  /^ +[A-Z]/ {
    1b # Skip for first line

    # Append, so the held months end up effectively prepended when they are
    # retrieved.
    H

    : get_held_months

    # Get the sorted middle and right month lines to output them.
    g
  }
'

Result:

       January      
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31                  
      February      
Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28                  
                    
        March       
Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31         
                    
        April       
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30   
                    
         May        
Su Mo Tu We Th Fr Sa
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31               
        June        
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30         
                    
        July        
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
                    
       August       
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31            
                    
      September     
Su Mo Tu We Th Fr Sa
          1  2  3  4
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30      
                    
       October      
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31                  
      November      
Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30            
                    
      December      
Su Mo Tu We Th Fr Sa
          1  2  3  4
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31   
                    

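Nice. Now, to be able to count weeks by piping into wc -l, the last week of each month has to be joined with the first week of the next one; otherwise a week that straddles two months takes up two lines and gets counted twice.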
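To make the double counting concrete, here is a small sketch using only the last row of January 2021 and the first row of February 2021, copied from the output above with their padding. The variable names are just for illustration.

```shell
# Last row of January and first row of February 2021, padding included.
split=$(printf '%s\n' '31                  ' '    1  2  3  4  5  6')

# As printed, the shared calendar week occupies two lines.
before=$(printf '%s\n' "$split" | wc -l | tr -d ' ')

# N pulls the next line into the pattern space; the substitution then replaces
# the padding around the embedded newline with a two-space column gap.
joined=$(printf '%s\n' "$split" | sed -E 'N; s/ +\n +/  /')
after=$(printf '%s\n' "$joined" | wc -l | tr -d ' ')

echo "$before -> $after: $joined"   # prints: 2 -> 1: 31  1  2  3  4  5  6
```

The scripts that follow first strip the headers and tag each month’s first week with its name, then apply this kind of join across the whole year.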

sed -E '
  # Save the month name
  /^ +[A-Z]/{s/ //g;h}

  # Delete everything but the date lines
  /^ *[1-9]/!d

  # If on the week with the first of the month, add the month name to the right.
  /\b1\b/{G;s/\n/ /}
'

Result:

                1  2 January
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31                  
    1  2  3  4  5  6 February
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28                  
    1  2  3  4  5  6 March
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31         
             1  2  3 April
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30   
                   1 May
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31               
       1  2  3  4  5 June
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30         
             1  2  3 July
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
 1  2  3  4  5  6  7 August
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31            
          1  2  3  4 September
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30      
                1  2 October
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31                  
    1  2  3  4  5  6 November
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30            
          1  2  3  4 December
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31   

Almost there.

sed -E '
  # Not 7 days this week?
  /[0-9]+(\s+[0-9]+){6}/!{
    # and is not the first week of the first month?
    1b

    # We must be at the end of the month. Bring the start of the next month.
    N

    # Join the start of the next to this end.
    s/ +\n +/  /
  }
'

Result:

                1  2 January
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31  1  2  3  4  5  6 February
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28  1  2  3  4  5  6 March
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31  1  2  3 April
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30  1 May
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31  1  2  3  4  5 June
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30  1  2  3 July
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
 1  2  3  4  5  6  7 August
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31  1  2  3  4 September
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30  1  2 October
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31  1  2  3  4  5  6 November
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30  1  2  3  4 December
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31   

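There we are. Not a perfect format, but good enough depending on the dates we’re curious about. The awk below remembers the most recent month name (the eighth field, when a row has one) and prints every row from the week containing July 7th to the week containing August 20th.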

awk '
  $8 { m = $8 }

  m == "July" && /\<7\>/, m == "August" && /20/
'

Result:

 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
 1  2  3  4  5  6  7 August
 8  9 10 11 12 13 14
15 16 17 18 19 20 21

And at long last:

wc -l

Result:

6

Crap. :(

That 6 should be a 7. That seems to be a bug in org-mode: it strips the terminating newline from the previous block’s output when storing it as a variable for the next block, the way a shell’s "$()" expansion would, but it doesn’t add one back when passing the text to the next block’s stdin, the way a shell’s <<< redirection would. Since the last line isn’t terminated by a newline, wc -l, which counts newline characters, doesn’t count it as a line.
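The off-by-one is easy to reproduce outside org-mode. This is a minimal sketch with made-up text, using printf with and without a final newline in place of org-mode’s variable passing.

```shell
# Two lines of made-up text; $() strips the trailing newline.
out=$(printf 'week 1\nweek 2\n')

# Fed back without a final newline: wc -l sees a single newline character.
unterminated=$(printf '%s' "$out" | wc -l | tr -d ' ')

# Fed back with the newline restored: both lines are counted.
terminated=$(printf '%s\n' "$out" | wc -l | tr -d ' ')

# awk counts records, so an unterminated last line is still counted.
records=$(printf '%s' "$out" | awk 'END { print NR }')

echo "$unterminated $terminated $records"   # prints: 1 2 2
```

So one possible workaround, if the missing newline can’t be restored, is to count with awk 'END { print NR }' instead of wc -l.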

Oh well. At least I learned a bit about working with code from org-mode with this.

EDIT: For the sake of completeness, here’s the sed equivalent of the awk bit I did at the end:

sed -E '
  # Hold the month name
  /[a-z]$/{h;s/.* //;x}

  # Append it when month is missing
  /[a-z]$/!G

  # Delete all lines not between July 7th and August 20th
  /\b7\b.*July/,/20.*August/!d

  # Delete appended month
  s/\n.*//
'

Result:

 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
 1  2  3  4  5  6  7 August
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
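
One more note on the awk version: the \< and \> word-boundary operators are GNU regex extensions, so they may not be available in every awk. Below is a sketch of the same selection using field comparisons instead. The function names weeks_between and has_day are my own, the sample is a slice of the joined output above, and the day test here is an exact match rather than the substring match /20/ used earlier.

```shell
# Hypothetical helper: reads the joined calendar on stdin and prints the rows
# from the week containing July 7th to the week containing August 20th.
weeks_between() {
  awk '
    # True when one of the (up to seven) day fields equals the given day.
    function has_day(day,    i) {
      for (i = 1; i <= NF && i <= 7; i++)
        if ($i == day) return 1
      return 0
    }
    NF == 8 { m = $8 }   # a full week tagged with a month name
    m == "July" && has_day(7), m == "August" && has_day(20)
  '
}

# A slice of the joined output, from the week of July 1st onwards.
sample='27 28 29 30  1  2  3 July
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
 1  2  3  4  5  6  7 August
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28'

printf '%s\n' "$sample" | weeks_between
```

On this sample it prints the same seven rows as the awk and sed versions above.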