Search code examples
pythonexcelduplicatesipautofill

Is there a way in python or Excel to duplicate rows and add a value to a certain column - explanation below


I have data that looks like this for IP addresses: for security reasons I am writing made up numbers here.

Subnet 1 Subnet 2 Site
5.22.128.0 17 Texas
5.22.0.0 17 Boston
etc etc etc

Question: Can I write a VBA or python code to do the below:

to take each Subnet 1 and: if the third octet is 128 then add 127 rows below it and fill them as such:

Subnet 1 Subnet 2 Site
5.22.128.0 17 Texas
5.22.129.0 17 Texas
5.22.130.0 17 Texas

.... all the way to:

Subnet 1 Subnet 2 Site
5.22.255.0 17 Texas

And if the third octet is 0 then do the same thing but from 0 to 127. while keeping the other data intact (Site and Subnet 2) the same.

I didn't really know where to begin so I don't have code but my thinking was:

either:

A. Change the decimals to commas to represent figures in millions then add a summation calc until it reaches certain numbers.

B.Create two lists one from 0-127 and one from 128-255 and then append them to the values on the columns but I still don't know how to get multiple rows for it.

I am fairly new but if there is anything wrong with the way the question is presented please let me know. - don't care if it is done through VBA or python as I can write both - Just need a direction as to how to start.


Solution

  • It can be done with pandas but requires a number of steps.

    0. Example data

    import pandas as pd
    
    df = pd.DataFrame({'Subnet 1': ['5.22.128.0', '5.22.0.0'],
                       'Subnet 2': [17, 17],
                       'Site': ['Texas', 'Boston']})
    

    1. Split IPs at dots

    >>> df2 = df["Subnet 1"].str.split(".", expand=True)
    >>> df2
       0   1    2  3
    0  5  22  128  0
    1  5  22    0  0
    

    2. Expand the IPs

    >>> df3 = (df2.join(df2.
                        pop(2). # Extract the third element
                        astype(int) # Cast the string to int
                        .apply(lambda start: range(start, start+128)) # expand the range
                        .explode() # explode into individual rows
                        .astype(str) # cast back into str
                       )
              )
    >>> df3
        0   1  3    2
    0   5  22  0  128
    0   5  22  0  129
    0   5  22  0  130
    0   5  22  0  131
    0   5  22  0  132
    .. ..  .. ..  ...
    1   5  22  0  123
    1   5  22  0  124
    1   5  22  0  125
    1   5  22  0  126
    1   5  22  0  127
    
    [256 rows x 4 columns]
    

    3. Join with original df

    >>> df_final = (df3[[0, 1, 2, 3]] # Select IPs in the right order
                    .agg('.'.join, axis=1) # Rebuild them
                    .to_frame(name="Subnet 1") # turn into df
                    .join(df.drop("Subnet 1", axis=1)) # join with original df
                   ) 
    >>> df_final
          Subnet 1  Subnet 2    Site
    0   5.22.128.0        17   Texas
    0   5.22.129.0        17   Texas
    0   5.22.130.0        17   Texas
    0   5.22.131.0        17   Texas
    0   5.22.132.0        17   Texas
    ..         ...       ...     ...
    1   5.22.123.0        17  Boston
    1   5.22.124.0        17  Boston
    1   5.22.125.0        17  Boston
    1   5.22.126.0        17  Boston
    1   5.22.127.0        17  Boston