
Regex for time stamps and corresponding lines

Unix & Linux Asked by Lesenger on December 24, 2021

I have a human-written text file that contains time stamps in the form dd-mm-yyyy, HH:MM or HH:MM:SS. I have managed to extract the time stamps from the file using a regex, but I would also like to get the line each time stamp came from. It would be nice to have the time stamps in one file and the corresponding lines in another. There can be several time stamps per line, in which case the same line should appear once per time stamp.

If this can be done, what if I want only a few words or a few lines around a time stamp? The idea is just to extract the time stamps together with their context.

So far I have been using MATLAB for this, but any *nix tool will do.

Edit: it seems that not every tool will do. I'm using a Mac and sometimes portable Git Bash for Windows. At least the Mac's grep no longer supports the -P option for Perl regex, which is apparently needed for lookaround such as (?<![0-9]).

Here is an example of the original file and the desired outputs:

original:

L&L logfile

14-5-12
16-05-2012
Experiment 1
Device 77212-123-123123
Instrument 2, 34g, 66hz
Notes:
Something weird happened 12:34
Everything is fine 13:07
Log
8:00 routine 1
8:20 routine 2
8:40 routine 3, 8:45 something went south
8:50 routine 4, 8:50:12 weird peak at data

output1:

14-5-12
16-05-2012
12:34
13:07
8:00
8:20
8:40
8:45
8:50
8:50:12

output2:

14-5-12
16-05-2012
Something weird happened 12:34
Everything is fine 13:07
8:00 routine 1
8:20 routine 2
8:40 routine 3, 8:45 something went south
8:40 routine 3, 8:45 something went south
8:50 routine 4, 8:50:12 weird peak at data
8:50 routine 4, 8:50:12 weird peak at data

One Answer

grep -Eo '[0-9.]{10},[0-9]{2}:[0-9]{2}(:[0-9]{2})?' text.file

will print just the time stamps. Remove the -o option and you get the full matching lines:

grep -E '[0-9.]{10},[0-9]{2}:[0-9]{2}(:[0-9]{2})?' text.file
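For the "few lines around a time stamp" part of the question, one option is grep's -C option, which both GNU and BSD grep accept. A minimal sketch, assuming the same pattern as above; the function name stamp_context is made up for illustration:

```shell
# Pattern copied from the answer above (dd.mm.yyyy,HH:MM with optional :SS).
pattern='[0-9.]{10},[0-9]{2}:[0-9]{2}(:[0-9]{2})?'

# Hypothetical helper: print every matching line plus N lines of context
# before and after it. N defaults to 1; input is read from standard input.
stamp_context() {
    grep -E -C "${1:-1}" -e "$pattern"
}
```

Something like stamp_context 2 < text.file would then show two lines on either side of each hit. Use -A or -B instead of -C if only the following or only the preceding lines are wanted.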

If the pattern [0-9.]{10} turns out to be too loose, it is easy to swap in the stricter ([0-9]{2}.){2}[0-9]{4}
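The sample file in the question uses dash-separated dates and bare times rather than dd.mm.yyyy,HH:MM, so the pattern needs adapting. Here is a rough sketch under that assumption; the names stamp_re and extract_stamps are hypothetical, and it sticks to grep -E so no -P is needed:

```shell
# Assumed formats, taken from the sample file in the question:
#   dates  d-m-yy up to dd-mm-yyyy (dash-separated)
#   times  H:MM or HH:MM, optionally followed by :SS
stamp_re='[0-9]{1,2}-[0-9]{1,2}-[0-9]{2,4}|[0-9]{1,2}:[0-9]{2}(:[0-9]{2})?'

# Print each time stamp found on standard input, one per line.
extract_stamps() {
    grep -Eo -e "$stamp_re"
}
```

Running extract_stamps < text.file on the sample should give the list shown as output1. Without lookaround there is no guard against digits glued to a longer number, so an ISO-style date such as 2012-05-16 would come out as 12-05-16; treat the pattern as a starting point.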
If you'd like to do both tasks at once, sed can do it, e.g.

sed -r '/[0-9.]{10},[0-9]{2}:[0-9]{2}(:[0-9]{2})?/w string.file
s/[^:]*([0-9.]{10},[0-9]{2}:[0-9]{2}(:[0-9]{2})?)/\1\n/;//P;D' text.file
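The sed command above relies on GNU extensions (-r, \n in the replacement). As a more portable alternative, here is an awk sketch that also repeats a line once per time stamp, as the question asks. It assumes the dash-separated dates and H:MM times from the sample file, and the function name stamps_with_lines is invented for this example:

```shell
# Hypothetical helper: reads a log on standard input and prints, for every
# time stamp found, the stamp, a tab, and the full line it came from.
# A line holding two stamps is therefore printed twice.
stamps_with_lines() {
    awk '
    BEGIN {
        d = "[0-9][0-9]?"                               # one or two digits
        date_re = d "-" d "-[0-9][0-9]([0-9][0-9])?"    # d-m-yy or dd-mm-yyyy
        time_re = d ":[0-9][0-9](:[0-9][0-9])?"         # H:MM, optional :SS
        re = "(" date_re ")|(" time_re ")"
    }
    {
        rest = $0
        # Walk along the line, consuming one match per iteration.
        while (match(rest, re)) {
            printf "%s\t%s\n", substr(rest, RSTART, RLENGTH), $0
            rest = substr(rest, RSTART + RLENGTH)
        }
    }'
}
```

The two files can then be split off with cut, for example stamps_with_lines < text.file | cut -f1 > stamps.txt for the time stamps and cut -f2- for the lines. The regexes avoid {n,m} intervals on purpose, since some older awk builds do not support them.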

Answered by Costas on December 24, 2021
