Best Audio Format for Storage?

stevecrox@kbin.social · 6 months ago

stevecrox@kbin.social · edit-2 11 months ago

This is about the new starter cost.

When a developer joins a team, they will not be fully productive at first, as they have to learn the code, frameworks, libraries, the project purpose, the tooling, etc… Often this impacts other members of the team, lowering the entire team’s productivity.

When you use productivity tracking (e.g. things like capacity planning) you will see the team’s performance drop, and it will take time for it to exceed the previously measured performance. This is the cost of adding a new starter.

So if it takes 6 weeks for a new starter to increase overall team productivity, then planning someone on a project for 4 weeks is pointless, since the team will have a higher delivery rate without the extra person. This is typically why an organisation loses its ability to migrate staff between projects.
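To put rough numbers on that, here is a minimal sketch. Every figure in it is a made-up assumption (a team delivering 10 units a week, a dip of 2 when the starter joins, a linear recovery over 6 weeks, a gain of 2 afterwards), and the function names are hypothetical; only the shape of the calculation matters.

```python
def weekly_output(week, baseline=10.0, dip=2.0, ramp_weeks=6, gain=2.0):
    """Hypothetical team output in `week` (1-based) after a new starter joins.

    All defaults are invented for illustration: output starts `dip` below
    the old baseline, recovers linearly until `ramp_weeks`, then sits
    `gain` above the baseline.
    """
    if week > ramp_weeks:
        return baseline + gain
    return baseline - dip * (1 - week / ramp_weeks)


def net_effect(weeks_on_project, baseline=10.0, **ramp):
    """Total output with the new starter minus total output without them."""
    with_starter = sum(
        weekly_output(week, baseline=baseline, **ramp)
        for week in range(1, weeks_on_project + 1)
    )
    return with_starter - baseline * weeks_on_project


# Compare a short placement with longer ones.
for weeks in (4, 6, 12):
    print(weeks, round(net_effect(weeks), 1))
```

With these assumed numbers the 4 week placement comes out negative: the team would have delivered more without the extra person. The placement only pays off once the starter has stayed well past the ramp-up period.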

Code formatting affects the layout of the code and our brains do all sorts of tricks around pattern recognition, so if your code formatting rules are too different, someone migrating between projects has to spend time looking for code and retraining their brain.

It’s an additional barrier, and one that is within an organisation’s power to remove (by forcing a common code standard).

stevecrox@kbin.social · 11 months ago

The last part is why you use an IDE.

Several of them will ingest Prettier files to build code formatting rules.

IDE support is normally a good way to work out what the wider community is using.

stevecrox@kbin.social · edit-2 11 months ago

Python is unique in that formatting forms part of the syntax. Every language has linters, but it’s far more common for orgs to tweak the default rules.

For example Java has Checkstyle. The default rules ‘sun checks’ give a line length of 80, tabs are 4 spaces and everything is placed on a new line.

Junior devs inevitably want to trash the line length (honestly, on 1080p monitors, 120 makes sense).

There is always a new line/same line discussion (everyone prefers same line but there is always one die-hard new line person).

The tab width discussion always has one junior dev complain that “tabs are better”. As someone who started development on Visual Studio 6, where half the team double spaced and the other half used tabs, I give those people a lecture on how we can convert tabs to spaces but not the inverse, so it will always be spaces if I am near.

With Checkstyle you upload the rule file as an artifact into your M2 repository. Then you can pull it down as a dependency when the checkstyle plugin runs.

stevecrox@kbin.social · 11 months ago

I avoid any company that requires a software test before the interview.

I worked for a company that introduced them after I joined. I collected evidence that all of the company’s top performers wouldn’t have joined, since we all had multiple offers and having to do the test would have put people off applying. The scores from it didn’t correlate with interview results, so it was being ignored by everyone. It still took 2 years to get rid of it.

The best place used STAR (Situation Task Action Result) based interviews. The goal was to ask questions until you got 2 stars.

I thought these were great because it was more varied and conversational, but there was a comparable consistency across interviewers.

You would inevitably get references to past work and you switch to asking a few questions about that. Since it was around a situation you would get more complete technical explanations (e.g. on that project I wrote an X and Y was really challenging because of Z).

I loved asking “Tell me about something you’re really proud of”. Even a nervous junior would start opening up after that question.

After an hour interview you would end up with enough information you could compare them against the company gradings (junior, senior, etc…).

This was important because it changed the attitude of the interview. It wasn’t a case of whether the candidate would be a good senior dev for project X, but an assessment of the candidate. If they came out as a lead and we had a lead role, let’s offer them that.

stevecrox@kbin.social · 1 year ago

Basic rule: if someone claims X magically solves a problem, they don’t follow X and are a huge generator of the problem.

For example, people who claim they don’t need to write comments because they write self documenting code are the people that use variable names x1, x2, y, etc…

Similarly anyone you meet claiming Test Driven Development means they have better tests will write code with appalling code coverage and epically bad tests.

stevecrox@kbin.social · 1 year ago

Nvidia drivers don’t tend to be as performant under Linux.

With AMD, instead of using the AMDVLK driver you would use RADV (developed largely by Valve), which performs better.

Every AMD card under Linux supports OpenCL (the driver is more based on graphics card architecture) and you can install it very easily. Googling it for Windows found pages of errors and missing support.

Blender supports OpenCL. I bet the 2x improvement is Blender being able to offload rendering to the AMD graphics card.

Also, this represents the biggest headache in Linux: lots of gamers insist they can only use Nvidia cards. Nvidia treats Linux as an afterthought at best, or deliberately sabotages things at worst.

AMD embraced open source and so Linux land is much nicer on AMD (and to a lesser extent Intel).

The results here will probably be a DXVK quirk. Lots of “Nvidia optimised” games have game engines doing weird things and the Nvidia driver compensates. DXVK has been identifying that to produce “good” Vulkan calls.

stevecrox@kbin.social · edit-2 1 year ago

This advice isn’t grounded in reality.

Management normally defines ways to track and judge itself, these are typically called Key Performance Indicators.

KPIs are normally things like contract value growth, new contracts signed, profit margin, etc…

So if the project manager is meeting or exceeding their KPIs and you walk up to their boss telling them the PM is failing at basic job functions, the boss won’t care.

This is because the boss might have set the KPIs or the boss might also be judged on them. In either situation it’s to the boss’s advantage to ignore you.

The boss will only care if there is a KPI you can demonstrate the PM failing to meet.

Every person/group will have various incentives and motivations. To effect change you have to understand what they are.

stevecrox@kbin.social · 1 year ago

A project manager has responsibility for delivery of a project but they typically lack domain specific knowledge. As a result they can’t directly deliver something, merely ask subject matter experts for advice and facilitate a team to deliver.

Most PMs cope with the stress of this position poorly.

This cartoon is an example of micro management (a common coping mechanism): the manager has involved themselves in the low level decisions because that gives a sense of control. If a technical team then tells them it’s a bad decision, the team is effectively attacking their coping mechanism.

The solution isn’t to tell them their technical idea is terrible, when you’ve fallen down this rabbit hole you have to treat the PM as a stakeholder. They are someone you have to manage, so a common solution is to give them confidence there is a path to delivery, a way to track and understand it.

stevecrox@kbin.social · 1 year ago

If you read the reports…

Normally JPL outsource their Mars mission hardware to Lockheed Martin. For some reason they have decided to do Mars Sample Return in house. The reports argue JPL hasn’t built the necessary in house experience and should have worked with LM.

Secondly JPL is suffering a staff shortage which is affecting other projects and the Mars Sample Return is making the problem worse.

Lastly if an organisation stops performing an action it “forgets” how to do it. You can rebuild the capability but it takes time.

A team arbitrarily declaring they are experts and suddenly deciding they will do it is one that will have to relearn skills/knowledge on a big, expensive, high profile project. The project will either fail (and be declared a success) or masses of money will be spent to compensate for the team’s learning.

Either situation is not ideal.

stevecrox@kbin.social · edit-2 1 year ago

The GAO has performed an annual review of the Space Launch System every year since 2014 and switched to reviewing the Artemis program in 2019.

Each year the GAO points out NASA isn’t tracking any costs and NASA argues with the GAO about the costs they assign. Then the GAO points out NASA has no concrete plan to reduce costs, and NASA goes nu’uh (see the article’s cost reduction “objectives”).

The last two reports have focused on the RS-25 engine. Last time the GAO was unhappy because an engine cost NASA $100 million and NASA had just granted a development contract to reduce the cost of the engine.

However, if you took the headline cost of the contract and split it over the planned engines, it was greater than the desired cost savings. NASA’s response was that development costs don’t count.

Congress reviews GAO reports and decides to give SLS more money.

stevecrox@kbin.social · 1 year ago

The other person was just wrong.

Large scale hydrogen generation isn’t done in a fossil free way. Hydrogen can be generated in a green way, but the infrastructure isn’t there to support SLS.

Hydrogen is high ISP (miles per gallon) but rubbish thrust (engine torque).

This means SLS only works with Solid Rocket Boosters, which are highly toxic and release greenhouse contributing material into the upper atmosphere. I suspect you would find Falcon 9/Starship are less polluting as a result.

Lastly the person implies SLS could be fueled by space sources (e.g. the moon).

SLS is a 2.5 stage rocket: the boosters are ditched in Earth’s atmosphere and the first stage is ditched at the edge of space. The current second stage doesn’t quite make low Earth orbit.

So someone would have to mine materials on the moon and ship them back. This would be far more expensive than producing hydrogen on Earth.

Hydrogen on the moon makes sense if you’re in lunar orbit, not from Earth.

stevecrox@kbin.social · edit-2 1 year ago

Do not mix tabs and spaces.

It’s impossible to automate checking that tabs were only used for indentation and spaces only for precise alignment. So you then take on the burden of manually checking.

You end up with the issue where someone didn’t realise and indented with spaces, or another person used tabs for precise alignment, and people forget to check the whitespace characters in review. It ends up going inconsistent and becoming a huge pile of technical debt to fix.

Use only one: you can automate enforcement and ensure the code renders consistently.
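Enforcing a single whitespace character is the easy case to automate. A minimal sketch, assuming a hand-rolled check purely for illustration (the function name is hypothetical, and a real project would switch on the equivalent rule in its linter):

```python
def indentation_problems(source, indent_char=" "):
    """Return 1-based line numbers whose leading whitespace is not made
    up solely of `indent_char` (pass "\t" for a tabs-only project)."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Everything before the first non-whitespace character.
        leading = line[: len(line) - len(line.lstrip(" \t"))]
        # Stripping the allowed character leaves only the offenders.
        if leading.strip(indent_char):
            problems.append(number)
    return problems


sample = "def f():\n    return 1\n\tx = 2\n  \ty = 3\n"
print(indentation_problems(sample))        # spaces-only project
print(indentation_problems(sample, "\t"))  # tabs-only project
```

The check only has to look at which characters appear, not at what each one was meant for. That is why it can be automated, while "tabs for indentation, spaces for alignment" needs a human to judge intent.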

stevecrox@kbin.social · edit-2 1 year ago

deleted by creator

stevecrox@kbin.social · edit-2 1 year ago

Years ago there was no way to share IDE settings between developers.

You ended up with some developers choosing a tab width of 2 spaces, some choosing 4 spaces and as there was no linting enforcement some people using 2-4 spaces depending on their IDE settings.

This resulted in an unreadable mess as stuff was indented to all sorts of random levels.

It doesn’t matter if you use tabs or spaces as long as only one type is consistently used within a project.

Spaces tend to win because inevitably there are times you need to use spaces, and so it’s difficult to ensure a project only uses tabs for indentation.

IDEs support converting tabs into spaces based on tab width, and code formatting will ensure correct indentation. You can now have centralised IDE settings so everyone gets the same setup.
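The tab-to-space conversion itself is simple. A minimal sketch of the idea using Python’s built-in `str.expandtabs` (the wrapper name and the default width of 4 are assumptions for illustration, not what any particular IDE does internally):

```python
def tabs_to_spaces(source, tab_width=4):
    """Replace each tab with spaces up to the next tab stop.

    str.expandtabs tracks the column and resets it at every newline, so
    a tab in the middle of a line is padded to the next multiple of
    `tab_width` rather than to a fixed number of spaces.
    """
    return source.expandtabs(tab_width)


code = "\tif x:\n\t\treturn 1\n"
print(repr(tabs_to_spaces(code)))
print(repr(tabs_to_spaces(code, tab_width=2)))
```

The result depends on the tab width chosen, which is why the setting needs to be shared across the team rather than left to each developer’s IDE.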

Honestly 99% of people don’t care about formatting (they only care when consistency isn’t enforced and code is hard to read), but there is always one person who wants a 60 character line width, or only tabs, or double new lined parentheses, and who then sucks up huge amounts of the team’s time arguing their thing is a must while they code in emacs, unlike the rest of the team using an actual IDE.

stevecrox@kbin.social · 1 year ago

I am actually arguing for a stable ABI.

The few times I have had to compile out of tree drivers for the Linux kernel, it has usually failed because the ABI has changed.

Each time I have looked into it, I found code churn, e.g. changing an enum to a char (or the other way) or messing with the parameter order.

If I was emperor of the world, the Linux kernel would be built using conan.io, with device trees pulling down drivers as dependencies.

The Linux ABI headers would move out into their own separately managed project, which is released and managed at its own rate. Subsystem maintainers would have to raise pull requests to change the ABI, and changing a parameter from enum to char because you prefer chars wouldn’t be good enough.

Each subsystem would be its own “project” with a logical repository structure (e.g. Intel and AMD GPU drivers don’t share code, so why would they be in the same repo?), built against the appropriate ABI version, with each repository released at its own rate.

Unsupported drivers would then be forked into their own repositories. This simplifies deprecation, since the driver is external to the supported drivers and doesn’t need to be refactored or maintained. If distributions can build them and want to include the driver, they can.

Linus’s job would be to maintain the core kernel, device trees and ABI projects, and provide a bill of materials for a selection of Linux kernel/ABI/driver versions which are supported.

Lastly, since every driver is a discrete buildable component, it would make it far easier for distributions to check if the driver is compatible with the kernel ABI they are using (e.g. change a dependency version and build) and provide new drivers with the build.

None of this will ever happen. C/C++ developers loathe dependency management and people can be strongly attached to mono repos for some reason.

stevecrox@kbin.social · edit-2 1 year ago

The linux kernel is very old school in how it is run and originally a big part of the DevSecOps movement was removing a lot of manual overhead.

Moving on to something like Gitea (codeberg) would give you a better diff view and is quicker/easier than posting a patch to a mailing list.

The branching model of the kernel is something people write up on paper that looks great (much like Gitflow) but is really time consuming to manage. Moving to a feature branch workflow and creating release branches as part of the release process allows a ton of things to be automated and simplified.

Similarly file systems aren’t really device specific, so you could build system tests for them for benchmarking and standard use cases.

Setting up a CI to perform smoke testing and linting is fairly standard.

It’s really easy to set up a CI to trigger when a new branch/PR is created/updated. This means review is reduced to checking business logic, which makes reviews really quick and easy.

Similarly, consider moving on to a decent issue tracker. Jira’s support for epics/stories/tasks/capabilities and its linking ability is a huge simplifier for long term planning.

You can do things like define OKRs and then attach epics to them and stories/tasks to epics, which lets you track progress to goals.

You can use issues the way the Linux community currently uses mailing lists.

Combined with a Kanban board for tracking the progress of tickets, you remove a ton of pain.

Although open source issue trackers are missing the key productivity enablers of Jira, which makes these improvements hard to realise.

The issue is people: the Linux kernel maintainers have been working one way for decades. Getting them to adopt new tools will be heavily resisted, same with changing how they work.

It’s like how everyone outside knows that separating the ABI definition from the subsystem implementation would create a far more stable ABI, which would solve a bunch of issues and allow change when needed, except no one in the kernel will entertain the idea.

stevecrox@kbin.social · 1 year ago

During the pandemic I had some unoccupied Python graduates I wanted to teach data engineering to.

Initially I had them implement REST wrappers around Apache OpenNLP and spaCy and then compare the results on random data sets (Project Gutenberg, SharePoint, etc…).

I ended up stealing a grad data scientist because we couldn’t find a difference (while there was a difference in confidence, the actual matches were identical).

spaCy required 1 vCPU and 12 GiB of RAM to produce the same result as OpenNLP running on 0.5 vCPU and 4.5 GiB of RAM.

2 grads were assigned a Spring Boot/Camel/OpenNLP stack and 2 a spaCy/Flask application. It took both groups 4 weeks to get a working result.

The team slowly acquired lockdown staff, so I introduced MinIO/RabbitMQ/NiFi/Hadoop/Express/React and then different file types (not raw UTF-8, but what about doc, pdf, etc…) for NLP pipelines. They built a fairly complex NLP processing system with a data exploration UI.

I figured I had a group to help me figure out the best Python approach in the space, but Python limitations just led to stuff like needing a Kubernetes volume to host data.

Conversely, none of the data scientists we acquired were willing to code in anything but Python.

I tried arguing in my company of the time that there was a huge unsolved bit of market there (e.g. MLOps).

Alas unless you can show profit on the first customer no business would invest. Which is why I am trying to start a business.

stevecrox@kbin.social · 1 year ago

This is why Java rocks with ETL, the language is built to access files via input/output streams.

It means you don’t need to download a local copy of a file, you can drop it into a data lake (S3, HDFS, etc…) and pass around a URI reference.
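The pattern itself (hand a URI to the consumer and let it read the bytes as a stream, never holding the whole file) can be sketched in a few lines. This is written in Python only to keep one language across the examples here, not as a claim about how the Java stack above does it. A local `file://` URI stands in for the data lake, since S3 or HDFS would need their own client libraries, and the function names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from urllib.request import urlopen


def sha256_of(uri, chunk_size=64 * 1024):
    """Hash whatever `uri` points at, reading it chunk by chunk so the
    whole payload is never held in memory at once."""
    digest = hashlib.sha256()
    with urlopen(uri) as stream:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo():
    """Write a throwaway file and hash it through its file:// URI."""
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / "sample.bin"
        path.write_bytes(b"x" * 200_000)
        return sha256_of(path.as_uri(), chunk_size=1024)


print(demo())
```

The consumer only ever sees a URI and a stream, so the same function works whether the bytes come from local disk or from a remote store that exposes a file-like reader.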

Considering the size of Large Language Models, I really am surprised at how poorly streaming is handled within Python.

stevecrox@kbin.social · 1 year ago

I think it’s a self burn.

Person has never been in a relationship and so has no ex to photograph.

stevecrox@kbin.social · 1 year ago

Maven has unit and integration test phases and there are a multitude of plugins designed to hook into those phases but there are constraints by design.

Trying to hook everything into the build management system is a source of technical debt: you’re using a tool for something it wasn’t designed for.

I would look at what makes sense within the build management system and what makes sense in a CI pipeline.

CI tools have different DSLs and usually provide a means to manage environments. Certain integration and system level tests are best performed there.

For instance, I keep system tests as a separate managed project. The project can be executed from developer machines for local builds, but I also create a small build pipeline, triggered by pull requests, to build the project, deploy it and run the system tests against it.

This is why I say the build management system doesn’t really change, because you should treat everything as discrete standalone components.

The Parent POM gets updated once every six months, the basic build verification CI pipeline only changes to the latest language release, etc…

Projects which try to embed Gitflow into a POM or integrate CD into the Gradle file are the unbuildable messes I get asked to fix.

stevecrox@kbin.social · edit-2 1 year ago

Interview with a Postdoc, Junior Python Developer
