Wednesday, May 27, 2020

CSharp - How to get either first row or last row of repeating values fast

Here's a design pattern to get either the first or the last row of a field that repeats over several consecutive rows. In this example the label before the colon, such as "File size", is the field to group on.

Internet media type                      : video/x-ms-wmv
File size                                : 372673
File size                                : 364 KiB
File size                                : 364 KiB
File size                                : 364 KiB
File size                                : 364 KiB
File size                                : 363.9 KiB
Duration                                 : 29166
Duration                                 : 29 s 166 ms
Duration                                 : 29 s 166 ms
Duration                                 : 29 s 166 ms
Duration                                 : 00:00:29.166
Duration                                 : 00:00:29:02
Duration                                 : 00:00:29.166 (00:00:29:02)
Overall bit rate                         : 102221
Overall bit rate                         : 102 kb/s
Maximum Overall bit rate                 : 103080
Maximum Overall bit rate                 : 103 kb/s

Result (last row of each repeating field)

Internet media type                      : video/x-ms-wmv
File size                                : 363.9 KiB
Duration                                 : 00:00:29.166 (00:00:29:02)
Overall bit rate                         : 102 kb/s
Maximum Overall bit rate                 : 103 kb/s

Here's the code. It gets either the first or the last row of a repeating field in a single pass over the lines, which is fast, and it does not rely on the LINQ library.

    using System;

    class Program
    {
        static void Main()
        {
            string m = @"
Internet media type                      : video/x-ms-wmv
File size                                : 372673
File size                                : 364 KiB
File size                                : 364 KiB
File size                                : 364 KiB
File size                                : 364 KiB
File size                                : 363.9 KiB
Duration                                 : 29166
Duration                                 : 29 s 166 ms
Duration                                 : 29 s 166 ms
Duration                                 : 29 s 166 ms
Duration                                 : 00:00:29.166
Duration                                 : 00:00:29:02
Duration                                 : 00:00:29.166 (00:00:29:02)
Overall bit rate                         : 102221
Overall bit rate                         : 102 kb/s
Maximum Overall bit rate                 : 103080
Maximum Overall bit rate                 : 103 kb/s"; 
            
            string field = string.Empty;
            string prevfield = string.Empty;
            int idxsemi = 0;
            
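            //the line endings inside a verbatim string come from the source file, not from Environment.NewLine, so split on both kinds
            string[] linesIn = m.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);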
            string[] linesOut = new string[linesIn.Length];
            
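            int idxOut = linesIn.Length - 1;              //linesOut is filled from the end, so the original order is kept
            for (int i = linesIn.Length - 1; i >= 0; i--) //get last field in a repeating list
          //for (int i = 0; i < linesIn.Length; i++)     //get first field in a repeating list
          //(for the forward loop also start idxOut at 0, use idxOut++ and TrimEnd() below)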
            {           
                idxsemi = linesIn[i].IndexOf(':');
                if (idxsemi > -1)
                    field = linesIn[i].Substring(0, idxsemi).TrimEnd(); //field to dedup, without the padding before ':'
                else
                    field = linesIn[i];

                if (prevfield == field)
                    continue;

                linesOut[idxOut--] = linesIn[i];
                
                prevfield = field;
                
            }

            //unused slots in linesOut are null and join as empty lines at the head; TrimStart gets rid of them
            string final = string.Join(Environment.NewLine, linesOut).TrimStart();
        }
        
    }
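
The same idea can be wrapped in a reusable method that handles both directions with one forward loop. This is only a sketch, not the code from the post above: the names RepeatingRows, FieldOf and Dedup are made up for illustration, it skips blank lines (an assumption), and it uses List<string> from System.Collections.Generic, still without LINQ.

```csharp
using System;
using System.Collections.Generic;

public static class RepeatingRows
{
    // Hypothetical helper: the text before the first ':' with trailing padding removed,
    // or the whole line when there is no ':'.
    static string FieldOf(string line)
    {
        int idx = line.IndexOf(':');
        return idx > -1 ? line.Substring(0, idx).TrimEnd() : line;
    }

    // keepLast = true keeps the last row of each run of equal fields,
    // keepLast = false keeps the first row of each run.
    public static List<string> Dedup(string[] lines, bool keepLast)
    {
        var result = new List<string>();
        string prevField = null;

        foreach (string line in lines)
        {
            if (line.Trim().Length == 0)
                continue;                          // assumption: blank lines are dropped

            string field = FieldOf(line);
            if (field != prevField)
                result.Add(line);                  // a new run starts here
            else if (keepLast)
                result[result.Count - 1] = line;   // same run: overwrite with the later row

            prevField = field;
        }
        return result;
    }
}
```

Calling RepeatingRows.Dedup(linesIn, true) on the split lines should give the same rows as the loop above, and passing false gives the first row of each group instead. Overwriting the last element avoids the backwards loop and the trimming of empty slots, at the cost of allocating a list.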
    


