TransWikia.com

Bash script to find maximum number of a certain character (".") in any single line of a file

Unix & Linux Asked on December 8, 2021

There is a file with an unknown number of lines. Each line contains an unknown number of periods (.).

How can I find the maximum number of periods on any single line? I am not interested in finding the line that contains the most periods.

For example: Processing the file content below in bash should give the answer "4".

one.one
two.two.two
three.three.three.three
four..four.
five..five..
six...six

6 Answers

JAAOV (Just another awk obfuscating variant...)

awk '{ gsub(/[^.]/, ""); print | "wc -L" }' file

Answered by JJoao on December 8, 2021

One way with awk could be as follows. We need to realize that the following equality holds:

  • number of fields = number of delimiters + 1
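As a quick check of that equality, here is a minimal sketch that splits one sample line on dots. The variable names (line, fields, dots) are only illustrative.

```shell
# Hypothetical sample line: three dots separate four fields.
line='three.three.three.three'

# With "." as the field separator, NF is the number of fields,
# so NF - 1 is the number of dots on the line.
fields=$(printf '%s\n' "$line" | awk -F '[.]' '{ print NF }')
dots=$(printf '%s\n' "$line" | awk -F '[.]' '{ print NF - 1 }')

echo "fields=$fields dots=$dots"   # prints: fields=4 dots=3
```

One caveat: an empty line has NF equal to 0, so the identity would give -1 there. That only matters if the input can contain blank lines.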

Note that adding 0 to an operand in an arithmetic comparison, although not always necessary, is a good habit to cultivate. It gives me one less thing to think about, because it becomes a reflex. Awk does not provide separate operators for arithmetic and string comparisons, so the coercion is needed to make clear that the operand is a number rather than a string.
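To illustrate why the coercion can matter, the following sketch compares two values that were assigned from string constants. The variable names are made up for the example. Without the added 0, awk compares them as strings.

```shell
# String constants compare as strings: "10" sorts before "9".
as_strings=$(awk 'BEGIN { a = "10"; b = "9"; print (a < b) }')

# Adding 0 forces a numeric comparison: 10 < 9 is false.
as_numbers=$(awk 'BEGIN { a = "10"; b = "9"; print (a+0 < b+0) }')

echo "string comparison: $as_strings, numeric comparison: $as_numbers"
# prints: string comparison: 1, numeric comparison: 0
```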

$ awk -F '[.]' '
    NF>m+0 {m=NF}
    END {print --m}
' file
4
$ awk '
    gsub(/[^.]+/, "") &&
    ! index(t, $0) { t = $0 }
    END { print length(t) }
' file
$ perl -lne '
    my $k = tr/.//;
    $k > $m and $m = $k;
    }{ print $m+0;
' file

The GNU sed editor can also be used together with the bc calculator. The idea is to strip all non-dot characters from each line and keep the longest run of dots seen so far in the hold space. At end of file, we transform the dots into a bc expression whose value is the number of dots.

$ sed -Ee '
    s/[^.]+//g;G
    /^(.*)\..*\n\1$/!ba
    s/\n.*//;h;:a
    $!d;g;s/./1+/g;s/$/0/
'  file | bc -l
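The final step of that script, turning a run of dots into a sum, can be tried in isolation. This is a small sketch with a hypothetical input of four dots. Shell arithmetic stands in for bc here (an assumption made only so that the example runs where bc is not installed).

```shell
# Hypothetical input: a line already reduced to four dots.
dots='....'

# Turn every dot into "1+" and close the sum with a trailing 0.
expr_text=$(printf '%s\n' "$dots" | sed 's/\./1+/g; s/$/0/')
echo "$expr_text"            # prints: 1+1+1+1+0

# Assumption: $(( )) replaces bc for evaluating the generated expression.
echo "$(( $expr_text ))"     # prints: 4
```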

Answered by Rakesh Sharma on December 8, 2021

Let's generate an example,

cat >file <<'X'
this.world.
this
1.2.3.4.5
all.is.done
X

With perl

perl -e 'while (<>) { $x = $n if ($n = ($_ =~ y/.//)) > $x } print "$x\n"' file
4

With awk

awk '{ gsub("[^.]", ""); if ((n = length($0)) > x) { x = n } } END { print x }' file
4

With tr and a non-POSIX extended version of wc

tr -cd '.\n' <file | wc -L
4

Answered by roaima on December 8, 2021

The awk-less answer:

sed 's/[^.]//g' test.dat | wc -L

In other words, keep only the dots, and use the -L option of wc: -L, --max-line-length: print the maximum display width
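Since -L is not specified by POSIX for wc, here is a hedged sketch of the same idea for systems where it is unavailable. It swaps wc -L for a short awk program that tracks the longest line. The temporary file and the variable names are only illustrative, and the data is the sample from the question.

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' one.one two.two.two three.three.three.three \
    four..four. five..five.. six...six > "$tmp"

# Keep only dots and newlines, then let awk remember the longest line.
# m + 0 prints 0 rather than an empty string when the input is empty.
max=$(tr -cd '.\n' < "$tmp" |
    awk 'length($0) > m { m = length($0) } END { print m + 0 }')

echo "$max"   # prints: 4
rm -f "$tmp"
```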

Answered by xenoid on December 8, 2021

Alternatively, you can count the number of a specific character, and leave the text unchanged for further processing, such as printing the line itself, or counting another character. gsub returns the number of replacements.

awk '{ nDot = gsub ("[.]", "."); etc .. }'
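A fuller sketch of this approach follows. It is one possible way to fill in the elided part, not necessarily what the author had in mind. Each dot is replaced with itself ("&"), so the line stays intact and can be printed alongside the count. The names max and longest are hypothetical, and the sample data is taken from another answer on this page.

```shell
tmp=$(mktemp)
printf '%s\n' this.world. this 1.2.3.4.5 all.is.done > "$tmp"

# gsub() returns the number of substitutions made. Replacing each dot
# with itself ("&") counts the dots and leaves $0 unchanged.
result=$(awk '
    { nDot = gsub(/[.]/, "&") }
    nDot > max { max = nDot; longest = $0 }
    END { printf "%d: %s\n", max, longest }
' "$tmp")

echo "$result"   # prints: 4: 1.2.3.4.5
rm -f "$tmp"
```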

Answered by Paul_Pedant on December 8, 2021

You could do it with awk:

awk '{gsub(/[^.]/,""); len=length(); if (len>max) {max=len}} END{printf("Largest count of \".\": %d\n",max)}' file.txt

This will, for every line, replace all characters that are not ., by "nothing" (i.e. remove everything that is not a .). Then, it will count the length of the remaining string, and store the largest value found in max. At end-of-file, it will print the result.
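To see the intermediate values, the following sketch prints the per-line count next to each original line for the sample from the question. It is only a trace of the steps described above. The variable names line and counts are illustrative.

```shell
# Print "<number of dots> <original line>" for each sample line.
counts=$(printf '%s\n' one.one two.two.two three.three.three.three \
    four..four. five..five.. six...six |
    awk '{ line = $0; gsub(/[^.]/, ""); print length($0), line }')

echo "$counts"
# prints:
# 1 one.one
# 2 two.two.two
# 3 three.three.three.three
# 3 four..four.
# 4 five..five..
# 3 six...six
```

The largest value in the first column, 4, is the answer the question asks for.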

Answered by AdminBee on December 8, 2021
