Portfolio

/Amir Vincent Zaman/>_

avzaman2020@outlook.com
Somerville, NJ
B.S. Computer Science

Scripting? Automation? What the heck?!?!?

Hello! Welcome back to the schoolhouse, population python! Or whatever... Last time I posted a script here was a few months ago and I was gettin all handy with PowerShell. Now the handy thing about PowerShell is its interaction with the Windows operating system. The baked-in functionality and WMI querying let PowerShell do stuff in Windows relatively seamlessly for a bunch of automatable taskables.

Here in comes the snek.

Python is a notoriously easy-to-use scripting language out of the box, and it's what I spent my undergraduate years mostly tampering with. Anything you can't do? Nope! There is likely a library for that, just pump that jawn into your environment and ur good.

So although I am in a learning phase with PowerShell, my latest automation station showed PowerShell to be more mid than anything.

So what're we doin here?

My latest automation tasker involved an archive system where images were simply zipped up in batches if the batch was old as heck and taking up valuable drive space. But on occasion... one must recover archived batches, and this is a daunting task for a little guy like me: going one by one, reading hundreds of batch names (long strings of random numbers) and unzipping each individual gentleman with 7-Zip.

But I'm an IT guy...

and a programmer...

and best believe a programmer gon program.

Now -> PowerShell stinks smelly style at handling zips. The main method I am aware of is the Expand-Archive command, which just unzips a whole dang archive. I need to go into the archive and be picky choosey. Luckily Python has this baked in with the zipfile library.

Some interesting things to note with zipfile:

Let's do it Vince, let's unzip these files!

So here's the script, tailored to take in a text file that contains a list of all the batch folder names im tryna extract:

import os
import zipfile

namesdir = 'names.txt'
outdir = 'retrieved-archives'

# make the output folder (no error if it already exists)
os.makedirs(outdir, exist_ok=True)

# collect every zip sitting in the current working directory
zips = []
for file in os.listdir('./'):
    if file.endswith('.zip'):
        zips.append(file)

names = set()
with open(namesdir) as txtfile:
    for line in txtfile:
        names.add(line.strip()) 

print(zips)
# print(names)

for zip_name in zips:
    with zipfile.ZipFile(zip_name) as z:
        filemembers = z.namelist()
        root = zipfile.Path(z)
        print('Checking FILE: ', zip_name)
        for folder in root.iterdir():
            # print('Checking FOLDER: ',folder.name)
            if folder.name in names:
                print('EXTRACTING FILES FROM: ',folder.name)
                # extract() only pulls one member, so grab everything under the folder;
                # the trailing slash keeps '123' from also matching '1234/...'
                for member in filemembers:
                    if member.startswith(folder.name + '/'):
                        z.extract(member, outdir, pwd=b'capture')

Now let me explain..

The first loop sets up a list so we can loop over all the zips in the current working directory. Because these batches are archived in randomly intervaled groupings, idk what archive is gonna have what I need.

The second loop handles populating a set of all the batch names I'm looking for. This will be quicker because then each zip needs to be opened only once, and not once for every name I'm looking to recover. Opening zips via this library can take a minute depending on the size.

The last big chunk is the core functionality. It looks inside of every zip in the current working directory, then within each zip it checks the name of every folder in the root of that zip. If the name matches one in the set, then it starts extracting.
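
If you just want to peek at which zips hold the batches you're after before extracting anything, the same matching idea can be sketched as a set intersection. This is a rough sketch, not the script above: the function name wanted_top_level_folders is made up for illustration, and it assumes the batch folders sit at the root of each zip.

```python
import zipfile

def wanted_top_level_folders(zip_path, names):
    """Hypothetical helper: top-level folder names in zip_path that are also in names."""
    with zipfile.ZipFile(zip_path) as z:
        # First path component of every member that lives inside a folder.
        # Zip member names always use '/' as the separator.
        top_level = {m.split('/', 1)[0] for m in z.namelist() if '/' in m}
    # Set intersection: only the folders we were asked to recover
    return top_level & set(names)
```

Calling it once per zip with the names set gives a quick report of where everything lives, without touching the disk.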

The weird thing about extracting this way is that extract() only pulls out that one exact member, not everything underneath it. So that's what that last loop does, just a manual extraction recurse!
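
For what it's worth, zipfile also has extractall(), which takes a members list, so the manual loop could probably be swapped for one call. A minimal sketch, assuming the folder sits at the root of the zip; extract_folder is a hypothetical helper name, not something from the library.

```python
import zipfile

def extract_folder(z, folder_name, outdir, pwd=None):
    """Hypothetical helper: extract every member under folder_name/ from an open ZipFile."""
    # Trailing slash so that '123' does not also match '1234/...'
    prefix = folder_name.rstrip('/') + '/'
    members = [m for m in z.namelist() if m.startswith(prefix)]
    # extractall() accepts a subset of namelist() through its members argument
    z.extractall(outdir, members=members, pwd=pwd)
    return members
```

In the script above that would look like extract_folder(z, folder.name, outdir, pwd=b'capture') in place of the innermost loop.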

a few improvements mayhaps?

  1. use a library that is more efficient at zip i/o. zipfile is ez cuz it's baked into python raw
  2. have the log print to a file live and save it even on force stopping the script
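
Improvement 2 could be sketched with the standard logging module. A FileHandler flushes after every record, so whatever was logged before a force stop should already be in the file. The logger name 'unzipper', the default file name extract.log, and the make_logger function are all made up for this sketch.

```python
import logging

def make_logger(logfile='extract.log'):
    """Hypothetical helper: a logger that appends each message to logfile right away."""
    logger = logging.getLogger('unzipper')
    logger.setLevel(logging.INFO)
    # Only attach a handler once, even if this is called again
    if not logger.handlers:
        handler = logging.FileHandler(logfile)
        handler.setFormatter(logging.Formatter('%(asctime)s %(message)s'))
        logger.addHandler(handler)
    return logger
```

The print calls in the script would then become something like log.info('EXTRACTING FILES FROM: %s', folder.name), with log = make_logger() near the top.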

Until next time :3

When I started this portfolio blog site thing I was like EVERY 2 WEEKS I AM ON THIS!!! But I think as long as there is some measurable consistency over a long span of time I will be content. So ye it's been a little over a month since my last check-in and I've done/gotten some cool stuff I wish to talk about here. Mainly my transition to Arch Linux from Windows on my home PC, but I've also been retrofitting my tech in general. This has served several purposes:

I'll write about it eventually, and maybe get some discussy sussy going...

Until next time! Ciao!

~Vince