Autore: affinitoalessandro Page 1 of 6

Google has agreed to a settlement over a class-action lawsuit regarding Chrome’s “Incognito” mode, which involves deleting billions of data records of users’ private browsing activities.
The settlement includes maintaining a change to Incognito mode that blocks third-party cookies by default, enhancing privacy for users and reducing the data Google collects.

Profile-guided optimization – The Go Programming Language (golang.org)

Go: The Complete Guide to Profiling Your Code | HackerNoon

Have you already tried Go profiling with PGO?

More informed compiler optimizations lead to better application performance.
Profiles from already-optimized binaries can be used, allowing for an iterative lifecycle of continuous improvement.
Go PGO is designed to be robust to changes between the profiled and current versions of the application.
Storing profiles in the source repository simplifies the build process and ensures reproducible builds.

https://jvns.ca/blog/2024/02/16/popular-git-config-options/#commit-verbose-true

Here’s a list of useful git options that could be very useful!

https://www.srepath.com/clearing-observability-delusions/

Observability is highlighted as the fundamental practice for all other Site Reliability Engineering (SRE) areas, essential for avoiding “flying blind.”

The article discusses common misconceptions that hinder success in observability, emphasizing the need for the right mindset and avoidance of overly complex solutions.§

The shift towards event-based Service Level Objectives (SLOs) is recommended over time-based metrics, advocating for simplicity and the importance of leadership support in SLO implementation.

https://blog.plerion.com/hacking-terraform-state-privilege-escalation/

The article discusses the security risks associated with Terraform state files in DevOps, particularly when an attacker gains the ability to edit them.

It highlights that while the Terraform state should be secure and only modifiable by the CI/CD pipeline, in reality, an attacker can exploit it to take over the entire infrastructure.
The piece emphasizes the importance of securing both the Terraform files and the state files, as well as implementing measures like state locking and permission configurations to prevent unauthorized access and modifications.
It also explores the potential for attackers to use custom providers to execute malicious code during the Terraform initialization process.

https://thehackernews.com/2024/03/microsoft-confirms-russian-hackers.html

The article details a cybersecurity breach where the Russian hacker group Midnight Blizzard accessed Microsoft’s source code and internal systems.

Microsoft confirmed the breach originated from a password spray attack on a non-production test account without multi-factor authentication.

The attack, which began in November 2023, led to the theft of undisclosed customer secrets communicated via email. Microsoft has contacted affected customers and increased security measures, but the full extent and impact of the breach remain under investigation. The incident highlights the global threat of sophisticated nation-state cyber attacks.

#3 Sharing Friday

By affinitoalessandro

On 2024-03-29

In devops, kubernetes, news, sharing

https://blog.cloudflare.com/harnessing-office-chaos

This page provides an in-depth look at how Cloudflare harnesses physical chaos to bolster Internet security and explores the potential of public randomness and timelock encryption in applications.

There is the story of Cloudflare’s LavaRand, a system that uses physical entropy sources like lava lamps for Internet security, has grown over four years, diversifying beyond its original single source.
Cloudflare handles millions of HTTP requests secured by TLS, which requires secure randomness.
LavaRand contributes true randomness to Cloudflare’s servers, enhancing the security of cryptographic protocols.

https://radar.cloudflare.com/security-and-attacks

Here’s you can find a very interesting public dashboard provided by CloudFlare showing a lot of stats about current cyber attacks

avelino/awesome-go: A curated list of awesome Go frameworks, libraries and software (github.com)

A curated list of awesome Go frameworks, libraries and software

https://www.anthropic.com/news/claude-3-family

ChatGPT4 has been beaten.

Introducing three new AI models – Haiku, Sonnet, and Opus – with ascending capabilities for various applications1.
Opus and Sonnet are now accessible via claude.ai and the Claude API, with Haiku coming soon.
Opus excels in benchmarks for AI systems.

All models feature improved analysis, forecasting, content creation, code generation, and multilingual conversation abilities.

kubectl trick of the week.

.bahsrc

function k_get_images_digests {
  ENV="$1";
  APP="$2"
  kubectl --context ${ENV}-aks \
          -n ${ENV}-security get pod \
          -l app.kubernetes.io/instance=${APP} \
          -o json| jq -r '.items[].status.containerStatuses[].imageID' |uniq -c
}

alias k-get-images-id=k_get_images_digests

Through this alias you can get all the image digests of a specific release filtering by its label and then filter for unique values

#2 Sharing Friday

By affinitoalessandro

On 2024-03-22

In news, sharing

News

Found a new security bug in Apple M-series chipset
The article discusses a new vulnerability in Apple’s M-series chips that allows attackers to extract secret encryption keys during cryptographic operations.
The flaw is due to the design of the chips’ data memory-dependent prefetcher (DMP) and cannot be patched directly, potentially affecting performance.
Redis is changing its licensing
Redis is adopting a dual licensing model for all future versions starting with Redis 7.4, using RSALv2 and SSPLv1 licenses, moving away from the BSD license.
Future Redis releases will integrate advanced data types and processing engines from Redis Stack, making them freely available as part of the core Redis product.
The new licenses restrict commercialization and managed service provision of Redis, aiming to protect Redis’ investments and its open source community.
Redis will continue to support its community and enterprise customers, with no changes for existing Redis Enterprise customers and continued support for partner ecosystem.
Nobody wants to work with our best engineer
The article discusses the challenges faced with an engineer who was technically skilled but difficult to work with.
It highlights the importance of teamwork and collaboration in engineering, emphasizing that being right is less important than being effective and considerate.

Bash

Get your current branch fast up-to-date with master with this alias

alias git-update-branch="current_branch=$(git branch --show-current); git switch master && git pull --force && git switch $current_branch && git merge master"

Software Architecture

Chubby OSDI paper by Mike Burrows
and here’s their presentation on this topic
https://www.usenix.org/conference/srecon23emea/presentation/virji
Chubby is intended to provide coarse-grained locking and reliable storage for loosely-coupled distributed systems, prioritizing availability and reliability over high performance.

It has been used to synchronize activities and agree on environmental information among clients, serving thousands concurrently.

Similar to a distributed file system, it offers advisory locks and event notifications, aiding in tasks like leader election for services like the Google File System and Bigtable.

The emphasis is on easy-to-understand semantics and moderate client availability, with less focus on throughput and storage capacity.

Database Simplification: It mentions the simplification of the system through the creation of a simple database using write-ahead logging and snapshotting.

Introduction to Google Site Reliability Engineering slides by Salim Virji
The presentation introduces key concepts related to SRE, emphasizing the importance of automating processes for reliability and efficiency.

It also delves into the delicate balance between risk-taking and maintaining system stability.

Throughout the slides, the material highlights teamwork, effective communication, and the impact of individual behavior within engineering teams. Overall, the session aims to equip students with practical insights for successful SRE practices while navigating the complexities of modern software systems.

#1 Sharing Friday

By affinitoalessandro

On 2024-03-18

In sharing

Kubernetes

To quickly check for all images in all #pods from a specific release (eg: Cassandra operator):

kubectl get pods -n prod-kssandra-application -l app.kubernetes.io/created-by=cass-operator -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s '[[:space:]]' '\n' |sort |uniq -c

Kubernetes Community Day (#KCD) Italy 2024 is getting closer!

AI

If you are an #Azure customer you may be eligible for an Azure #OpenAI account, you can choose between different models and use cases.
Azure OpenAI Service – Pricing | Microsoft Azure

News

GitHub is struggling to contain an ongoing #attack that’s flooding the site with millions of code repositories. These repositories contain obfuscated malware that steals passwords and cryptocurrency from developer devices.
Please make sure to use only well-known repos from known organizations, do not consider the number of stars a valid metric.

After the big license change now someone talks about a big sell in Hashicorp 😦
https://archive.is/iPekU

Bash

To generate strong random #password you don’t need online suspicious services but just old plain bash/WSL.
This function leverages your filesystem folder /dev/urandom,
the output is cryptographically secure and we then match only acceptable characters in a list and finally cut a 16 length string.

Keep it with you as an alias in your .bashrc maybe 🙂

function getNewPsw(){   
  tr -dc 'A-Za-z0-9!"#$%&'\''()*+,-./:;<=>?@[\]^_`{|}~' </dev/urandom | head -c 16; echo 
}

You can check the OWASP recommended special characters here:
https://owasp.org/www-community/password-special-characters

SAFe VS Platform Engineering

By affinitoalessandro

On 2023-03-30

In devops, management

I know this is a very opinionated topic and "agile coaches" everywhere are ready to fight, so I'll try to keep it short making clear this is based just on my experience and on discussions with other engineers and managers in different companies and levels.

We’re a team of Scaled Agile SRE,
Working together to deliver quality,
Breaking down silos and communication gaps,
We’re on a mission to make sure nothing lacks.

We follow the SAFe framework to a tee,
With its ARTs and PI planning, we’re not so free,
To deliver value in every sprint,
Continuous delivery is our mint.

Chorus:
Scaled Agile and SRE,
Together we achieve,
Quality and speed,
We’re the dream team.

We prioritize work and plan ahead,
Collaborate and ensure nothing’s left unsaid,
We monitor, measure, and analyze,
Our systems to avoid any surprise.

Chorus

We take ownership and accountability,
To deliver value with reliability.

Chorus

So when you need to deliver at scale,
You know who to call and who won’t fail,
Scaled Agile SRE,
Together we’re the ultimate recipe.
ChatGPT4 & me

To not make this post too verbose I’ll try to focus only on two points that I find paramount in a SRE team living in a Scaled Agile framework (SAFe) with a Kanban style approach: capacity planning and value flow.

Capacity

What is your definition of capacity?

Most of the teams don’t ask this simple question to themselves and then struggle for months to give a better planning. Is that the sum of our hours per day? Or is it calculated based on each one capacity after removing the average amount of support, maintenance, security fixes and operations emergencies?

While learning to drive, in general but even more for a motorcycle, you’re introduced to the paradoxical concept of “expect the unexpected!“

Of course, this won’t save always your life but surely it can reduce a lot the probability of you having an accident. It will because you’ll stick to some best practices tested in tens of years of driving. Like to not surpass while not seeing the exit of a turn, don’t drive too close to the previous vehicle, always consider the status of the road, the surroundings and your tires before speeding up…

The good part of computer science is that you have a lot of incidents!

But this becomes a value only if you start measuring them and then learning from them.

So we should consider our work less like artistic craftsmanship and more from a statistical point of view, going back over the closed user stories and trying to get some average completion time splitting by categories (support, emergencies, toil elimination, research…)

Nobody complains!

You have now a rough estimation of how much time is spent on variable actions and maintenance, let’s say 20 hours per week.

You know also your fixed appointments will be at least 20 min per day for the daily meeting, 1 hour per week to share issues coming from development teams and 1 hour for infrastructure refinement (open tasks evaluation, innovations to adopt or to share with the team…).

Let’s say you won’t be neither on support (answering dev teams questions and providing them new resources) nor on call (supporting operations team solving emergencies).

This will give you around 40 – 20 – 1 (dailies) – 1 (weekly) – 1 (infra) – 1 (dev team weekly) – 0.5 (weekly with your manager) = 15.5 h/w of capacity, meaning 31h of capacity for the next iteration if it lasts two weeks.

Probably less since you know you have already other two periodical useless meeting of one hour each, so let’s round to 13 h/w ≈ 150 min/day of “uninterrupted” work.

Well… actually to not get crazy and start physically fighting my hardware I need a couple of breaks, let’s say 15 min in the morning and the same in the middle of the afternoon.

That means ≈ 120 min/day of “uninterrupted” work.

Fine, I assume I can take that user story we’ve evaluated 10h with high priority for the next week and a smaller one for the next week leaving some contingency space.

We publish this results in the PI planning and to the management, and nobody complains.

Long story short: if nobody ever complains probably you’re not involving stakeholders correctly in your PI Planning or worse you’re not involving them at all!

And that’s bad.

Why are you working on those features?

Why those features exist in first place?

If your team is decoupled from the business view, are you sure that all this effort will help something? Or do you smell re-work and failure?

We should mention also that these planning didn’t leave any space for research and creative thinking. People will start solving issues quick and dirty more and more.

Yeah, I could call Moss and Roy for a good pair programming sessions since they have already solved this issue in the last iteration but… who wants another meeting? Let’s copy paste this work around and go on for now…

How much value has my work?

To measure value, we need some kind of indicator.

There are a lot of articles on cons and pros about setting metrics for our goal even before starting. Let’s say here that you want to have a few custom indicators that proves to be a good estimation based on previous experience, they should take in consideration side effects and they should be some kind of aggregated result meaning that they shouldn’t be easily hackable (working only to improve the metrics and not the quality).

Maybe we introduce general service availability and average service response time as two service level indicators (SLI).

Then we start having management working on Value Stream Analysis to understand where this values since it was requested as a new feature by the customers before the current agile train.

They succeed to reduce periodical meetings by 50% and increase 1 to 1 communication. Now dev teams are able to solve issues by themselves thanks to better documentation and run-books etc…

Conclusions

Imagine you are trying to implement a complex application in Golang, after a while you’re still failing, so you decide to switch to Java Quarkus, that you don’t know and to mess around because you heard it is easier. After a while guess what? It still doesn’t work.

The same is for the Agile frameworks. People expect them to solve stuff auto-magically, but if we don’t put effort into changing our own behavior, into measuring our-self in order to improve (and not to give our manager micromanagement power), using the latest agile methodology will never solve our Friday afternoon issues.

Sources

Implementing continuous SBOM analysis

By affinitoalessandro

On 2023-02-10

In devops, security

From-cves-scanners-to-sbom-generation
You are here!
Dependency Track – To come!

After the deep theoretical dive of the previous article let’s try to translate all that jazz in some real example and practical use cases for implementing a continuous SBOM file generation.

Verse 1)
Grype and Syft, two brothers, so true
In the world of tech, they’re both making their due
One’s all about security, keeping us safe
The other’s about privacy, a noble crusade

(Chorus)
Together they stand, with a mission in hand
To make the digital world a better place, you understand
Grype and Syft, two brothers, so bright
Working side by side, to make the world’s tech just right

(Verse 2)
Grype’s the strong one, he’s got all the might
He’ll protect your data, day and night
Syft’s got the brains, he’s always so smart
He’ll keep your secrets, close to your heart

(Chorus)
ChatGPT

[Azure pipelines] Grype + Syft

Following there is a working example of a sample Azure pipeline comprehending two templates for having a vulnerabilities scanner job and a parallel SBOM generation.

The first job will leverage Grype, a known open-source project by Anchore, while for the second one we will use its brother/sister Syft.

At the beginning what we do is to make sure this become a continuous scanning by selecting pushes on master as a trigger action, for example to have it start after each merge on a completed pull request.

You can specify the full name of the branch (for example, master) or a wildcard (for example, releases/*). See Wildcards for information on the wildcard syntax. For more complex triggers that use exclude or batch, check the full syntax on Microsoft documentation.

In the Grype template we will

download the latest binary from the public project
set the needed permissions to read and execute the binary
check if there is a grype.yaml with some extra configurations
run the vulnerability scanner on the given image. The Grype databse will be updated before each scan
save the results in a file “output_grype”
use the output_grype to check if there are alerts that are at least High, if so we want also a Warning to be raised in our Azure DevOps web interface.

In the Syft template we will have a similar list of parameter, with the addition of the SBOM file format (json, text, cyclonedx-xml, cyclonedx-json, and much more).

After scanning our image for all its components we then publish the artifact in our pipeline, since probably we’ll want to pull this list from a SBOM analysis tool (i.e: OWASP Dependency-Track, see previous article).

Go to the code below. |🆗tested code |

Github Actions

In GitHub it would be even easier since Syft is offered as a service by an Anchore action.

By default, this action will execute a Syft scan in the workspace directory and upload a workflow artifact SBOM in SPDX format. It will also detect if being run during a GitHub release and upload the SBOM as a release asset.

A sample would be something like this:

name: Generate and Publish SBOM

on:
  push:
    branches:
      - main

env:
  DOCKER_IMAGE: <your-docker-image-name>
  ANCHORE_API_KEY: ${{ secrets.ANCHORE_API_KEY }}
  SBOM_ANALYSIS_TOOL_API_KEY: ${{ secrets.SBOM_ANALYSIS_TOOL_API_KEY }}

jobs:
  generate_sbom:
    runs-on: ubuntu-20.04

    steps:
    - name: Checkout code
      uses: actions/checkout@v2

    - name: Generate SBOM using Anchore SBOM Action
      uses: anchore/actions/generate-sbom@v1
      with:
        image_reference: ${{ env.DOCKER_IMAGE }}
        api_key: ${{ env.ANCHORE_API_KEY }}

    - name: Publish SBOM
      uses: actions/upload-artifact@v2
      with:
        name: sbom.json
        path: anchore_sbom.json

Code Samples

cve-sbom-azure-pipeline.yml

You like it You click it!

From CVEs scanners to SBOM generation

By affinitoalessandro

On 2023-02-04

In devops, security

Example of Software Life Cycle and Bill of Materials Assembly Line

DevOps companies have always been in a constant pursuit of making their software development process faster, efficient, and secure. In the quest for better software security, a shift is happening from using traditional vulnerability scanners to utilizing Software Bill of Materials (SBOM) generation. This article explains why devops companies are making the switch and how SBOM generation provides better security for their software.

A CVE is known to all, it’s a security flaw call
It’s a number assigned, to an exposure we’ve spied
It helps track and prevent, any cyber threats that might hide!

Vulnerability scanners are software tools that identify security flaws and vulnerabilities in the code, systems, and applications. They have been used for many years to secure software and have proven to be effective. However, the increasing complexity of software systems, the speed of software development, and the need for real-time security data have exposed the limitations of traditional vulnerability scanners.

Executive Order 14028

Executive Order 14028, signed by President Biden on January 26, 2021, aims to improve the cybersecurity of federal networks and critical infrastructure by strengthening software supply chain security. The order requires federal agencies to adopt measures to ensure the security of software throughout its entire lifecycle, from development to deployment and maintenance.

NIST consulted with the National Security Agency (NSA), Office of Management and Budget (OMB), Cybersecurity & Infrastructure Security Agency (CISA), and the Director of National Intelligence (DNI) and then defined “critical software” by June 26, 2021.

Such guidance shall include standards, procedures, or criteria regarding providing a purchaser a Software Bill of Materials (SBOM) for each product directly or by publishing it on a public website.

Object Model

SBOM generation is a newer approach to software security that provides a comprehensive view of the components and dependencies that make up a software system. SBOMs allow devops companies to see the full picture of their software and understand all the components, including open-source libraries and dependencies, that are used in their software development process. This information is critical for devops companies to have, as it allows them to stay on top of security vulnerabilities and take the necessary measures to keep their software secure.

The main advantage of SBOM generation over vulnerability scanners is that SBOMs provide a real-time view of software components and dependencies, while vulnerability scanners only provide information about known vulnerabilities.

One practical example of a SBOM generation tool is Trivy, an open-source vulnerability scanner for container images and runtime environments. It detects vulnerabilities in real-time and integrates with the CI/CD pipeline, making it an effective tool for devops companies.

Another example is Anchore Grype, a cloud-based SBOM generation tool that provides real-time visibility into software components and dependencies, making it easier for devops companies to stay on top of security vulnerabilities.

Finally, Dependency Track is another great tool by OWASP that allows organizations to identify and reduce risk in the software supply chain.
The Open Web Application Security Project^® (OWASP) is a nonprofit foundation that works to improve the security of software through community-led open-source software projects.

The main features of Dependency Track include:

Continuous component tracking: Dependency Track tracks changes to software components and dependencies in real-time, ensuring up-to-date security information.
Vulnerability Management: The tool integrates with leading vulnerability databases, including the National Vulnerability Database (NVD), to provide accurate and up-to-date information on known vulnerabilities.
Policy enforcement: Dependency Track enables organizations to create custom policies to enforce specific security requirements and automate the enforcement of these policies.
Component Intelligence: The tool provides detailed information on components and dependencies, including licenses, licenses and age, and other relevant information.
Integration with DevOps tools: Dependency Track integrates with popular DevOps tools, such as Jenkins and GitHub, to provide a seamless experience for devops teams.
Reporting and Dashboards: Dependency Track provides customizable reports and dashboards to help organizations visualize their software components and dependencies, and identify potential security risks.

References

CKS Challenge #1

By affinitoalessandro

On 2022-04-20

In kubernetes, security

Here we’re going to see together how to solve a bugged Kubernetes architecture, thanks to a nice KodeKloud challenge, where:

The persistent volume claim can’t be bound to the persistent volume
Load the ‘AppArmor` profile called ‘custom-nginx’ and ensure it is enforced.
The deployment alpha-xyz use an insecure image and needs to mount the ‘data volume’.
‘alpha-svc’ should be exposed on ‘port: 80’ and ‘targetPort: 80’ as ClusterIP
Create a NetworkPolicy called ‘restrict-inbound’ in the ‘alpha’ namespace. Policy Type = ‘Ingress’. Inbound access only allowed from the pod called ‘middleware’ with label ‘app=middleware’. Inbound access only allowed to TCP port 80 on pods matching the policy
‘external’ pod should NOT be able to connect to ‘alpha-svc’ on port 80

1 Persistent Volume Claim

So first of all we notice the PVC is there but is pending, so let’s look into it

One of the first differences we notice is the kind of access which is ReadWriteOnce on the PVC while ReadWriteMany on the PV.

Also we want to check if that storage is present on the cluster.

Let’s fix that creating a local-storage resource:

Get the PVC YAML, delete the extra lines and modify access mode:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
  - kubernetes.io/pvc-protection
  name: alpha-pvc
  namespace: alpha
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: local-storage
  volumeMode: Filesystem

Now the PVC is “waiting for first consumer”.. so let’s move to deployment fixing 🙂

https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims

https://kubernetes.io/docs/concepts/storage/storage-classes/#local

2 App Armor

Before fixing the deployment we need to load the App Armor profile, otherwise the pod won’t start.

To do this we move our profile inside /etc/app-arrmor.d and enable it enforced

3 DEPLOYMENT

For this exercise the permitted images are: ‘nginx:alpine’, ‘bitnami/nginx’, ‘nginx:1.13’, ‘nginx:1.17’, ‘nginx:1.16’and ‘nginx:1.14’.
We use ‘trivy‘ to find the image with the least number of ‘CRITICAL’ vulnerabilities.

Let’s give it a look at what we have now

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: alpha-xyz
  name: alpha-xyz
  namespace: alpha
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alpha-xyz
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: alpha-xyz
    spec:
      containers:
      - image: ?
        name: nginx

We can start scanning all our images to see that the most secure is the alpine version

So we can now fix the deployment in two ways

put nginx:alpine image
add alpha-pvc as a volume named ‘data-volume’
insert the annotation for the app-armor profile created before

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: alpha-xyz
  name: alpha-xyz
  namespace: alpha
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alpha-xyz
  strategy: {}
  template:
    metadata:
      labels:
        app: alpha-xyz
      annotations:
        container.apparmor.security.beta.kubernetes.io/nginx: localhost/custom-nginx
    spec:
      containers:
      - image: nginx:alpine
        name: nginx
        volumeMounts:
        - name: data-volume
          mountPath: /usr/share/nginx/html
      volumes:
      - name: data-volume
        persistentVolumeClaim:
          claimName: alpha-pvc
---

4 SERVICE

We can be fast on this with one line

kubectl expose deployment alpha-xyz --type=ClusterIP --name=alpha-svc --namespace=alpha --port=80 --target-port=80

5 NETWORK POLICY

Here we want to apply

over pods matching ‘alpha-xyz’ label
only for incoming (ingress) traffic
restrict it from pods labelled as ‘middleware’
over port 80

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-inbound
  namespace: alpha
spec:
  podSelector:
    matchLabels:
      app: alpha-xyz
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: middleware
      ports:
        - protocol: TCP
          port: 80

We can test now the route is closed between the external pod and the alpha-xyz

Done!

REFERENCES:

Connect to an external service on a different AKS cluster through private network

By affinitoalessandro

On 2022-02-03

In cloud, kubernetes, linux, networking, programmazione

My goal is to call a service on an AKS cluster (aks1/US) from a pod on a second AKS cluster (aks2/EU).
These clusters will be on different regions and should communicate over a private network.

For the cluster networking I’m using the Azure CNI plugin.

Above you can see a schema of the two possible ending architectures. ExternalName or ExternalIP service on the US AKS pointing to a private EU ingress controller IP.

So, after some reading and some video listening, it seemed for me that the best option was to use an externalName service on AKS2 calling a service defined in a custom private DNS zone (ecommerce.private.eu.dev), being these two VNets peered before.

Address space for aks services:
dev-vnet  10.0.0.0/14
=======================================
dev-test1-aks   v1.22.4 - 1 node
dev-test1-vnet  11.0.0.0/16
=======================================
dev-test2-aks   v1.22.4 - 1 node
dev-test2-vnet  11.1.0.0/16

After some trials I can get connectivity between pods networks but I was never able to reach the service network from the other cluster.

I don’t have any active firewall
I’ve peered all three networks: dev-test1-vnet, dev-test2-vnet, dev-vnet (services CIDR)
I’ve create a Private DNS zones private.eu.dev where I’ve put the “ecommerce” A record (10.0.129.155) that should be resolved by the externalName service

dev-test1-aks (EU cluster):

kubectl create deployment eu-ecommerce --image=k8s.gcr.io/echoserver:1.4 --port=8080 --replicas=1

kubectl expose deployment eu-ecommerce --type=ClusterIP --port=8080 --name=eu-ecommerce

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.1.1/deploy/static/provider/cloud/deploy.yaml

kubectl create ingress eu-ecommerce --class=nginx --rule=eu.ecommerce/*=eu-ecommerce:8080

This is the ingress rule:

❯ kubectl --context=dev-test1-aks get ingress eu-ecommerce-2 -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: eu-ecommerce-2
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: lb.private.eu.dev
    http:
      paths:
      - backend:
          service:
            name: eu-ecommerce
            port:
              number: 8080
        path: /ecommerce
        pathType: Prefix
status:
  loadBalancer:
    ingress:
    - ip: 20.xxxxx

This is one of the externalName I’ve tried on dev-test2-aks:

apiVersion: v1
kind: Service
metadata:
  name: eu-services
  namespace: default
spec:
  type: ExternalName
  externalName: ecommerce.private.eu.dev
  ports:
    - port: 8080
      protocol: TCP

These are some of my tests:

# --- Test externalName 
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://eu-services:8080
: '
    wget: cant connect to remote host (10.0.129.155): Connection timed out
'

# --- Test connectivity AKS1 -> eu-ecommerce service
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://eu-ecommerce:8080
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://10.0.129.155:8080
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://eu-ecommerce.default.svc.cluster.local:8080
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://ecommerce.private.eu.dev:8080
# OK client_address=11.0.0.11

# --- Test connectivity AKS2 -> eu-ecommerce POD
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://11.0.0.103:8080
#> OK


# --- Test connectivity - LB private IP
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget --no-cache -qO- http://lb.private.eu.dev/ecommerce
#> OK
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget --no-cache -qO- http://lb.private.eu.dev/ecommerce
#> KO  wget: can't connect to remote host (10.0.11.164): Connection timed out
#>> This is the ClusterIP! -> Think twice!


# --- Traceroute gives no informations
kubectl --context=dev-test2-aks  run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- traceroute -n -m4 ecommerce.private.eu.dev
: '
    *  *  *
    3  *  *  *
    4  *  *  *
'

# --- test2-aks can see the private dns zone and resolve the hostname
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- nslookup ecommerce.private.eu.dev
: ' Server:    10.0.0.10
    Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
    Name:      ecommerce.private.eu.dev
    Address 1: 10.0.129.155
'

I’ve also created inbound and outbound network policies for the AKS networks:

on dev-aks (10.0/16) allow all incoming from 11.1/16 and 11.0/16
on dev-test2-aks allow any outbound

SOLUTION: Set the LB as an internal LB exposing the external IP to the private subnet

kubectl --context=dev-test1-aks patch service -n ingress-nginx ingress-nginx-controller --patch '{"metadata": {"annotations": {"service.beta.kubernetes.io/azure-load-balancer-internal": "tr
ue"}}}'

This article is also in Medium 🙂

Seen docs:

Differences from Scrum, Lean and Disciplined Agile Delivery

By affinitoalessandro

On 2021-10-06

In devops, management, Senza categoria

So, your manager just finished a SCRUM course, because your enterprise company thinks it is the cutting-edge management process and now everything should be SCRUM or something very close…

Are you doing SCRUM?

How much time do you dedicate to sprint planning?

Do you have a fixed, cross-functional and autonomous team assigned to fixed length sprints full time?

Do you have a dedicated person for managing business requirements inside a backlog?

Are you taking short (5 min per person) daily stand-up meetings where everyone shares just the blocking points to the rest of the team and the Scrum master?

Are you sure you need Scrum?

Applying a complex methodology when you are in a deep technical depth situation will just make the things worse.
It is what Martin Fowler calls Flaccid Scrum.

In this case what you really need to do first is to increment your delivery fluency starting from practices like Continuous Delivery or applying pragmatic methodologies like Extreme Programming.

For many people, this situation is exacerbated by Scrum because Scrum is a process that’s centered on project management techniques and deliberately omits any technical practices, in contrast to (for example) Extreme Programming.
Martin Fowler

Fluent Delivering teams not only focus on business value, they realize that value by shipping as often as their market will accept it. This is called “shipping on the market’s cadence.”

Delivering teams are distinguished from Focusing teams not only by their ability to ship, but their ability to ship at will.

Extreme Programming (XP) pioneered many of the techniques used by delivering teams and it remains a major influence today. Nearly all fluent teams use its major innovations, such as continuous integration, test-driven
development, and “merciless” refactoring.

In recent years, the DevOps movement has extended XP’s ideas to modern cloud-based environments.

Comparing Scrum with Lean

So, let’s say your company’s managers already read this article and its related sources, so you’re really going fast on your CI/C processes and almost everything is versioned and monitored…

How to manage that in a big company with a lot of distributed teams?

Let’s give a fast look to Lean and then to Disciplined Agile Delivery.

SCHEDULE / TIME

Agile: fixed timeboxes and release plans are used to schedule your next activities. You need to sort your activities in order to plan your tasks by priority in a managed backlog.

Lean: the schedule can vary based on priority of the tasks exposed in a Kanban board that should be always visible by every one. No need for all the team to be full time on one task, the experts can use a divide-and-conquer approach, focusing on the most critical parts first and releasing when it is possible, following the customer Service Agreements.

SCOPE

Agile: the sprint backlog will contain the minimum scope necessary to develop the next product release

Lean: the tasks are generated by customer tickets where they specify also the urgency level.

BUDGET

Agile: ROI and Burndown charts are used to monitor budget during the project

Lean: KPI and Service Level Agreement are used to continuously check product quality and the production chain efficiency

Disciplined Agile Delivery

The Disciplined Agile Delivery (DAD) process framework is a peoplefirst,
learning-oriented hybrid agile approach to IT solution delivery. It
has a risk-value lifecycle, is goal-driven, is scalable, and is enterprise
aware.

Here the differences from Scrum, Lean and Disciplined Agile Delivery.

PEOPLE

Keep the docs at the really minimum.
The traditional approach of having formal handoffs of work products (primarily documents) between different disciplines such as requirements, analysis, design, test, and development is a very poor way to transfer knowledge that creates bottlenecks and proves in practice to be a huge source of waste of both time and money.

Teams should be cross-functional with no internal hierarchy. In Scrum for instance, there are only three Scrum team roles: Scrum Master, product owner, and team member. The primary roles described by DAD are stakeholder, team lead, team member, product owner, and architecture owner.

LEARNING

The first aspect is domain learning: how are you exploring and identifying what your stakeholders need, and
perhaps more importantly, how are you helping the team to do so?

The second aspect is process learning, which focuses on learning to improve your process at the individual, team, and enterprise levels.

The third aspect is technical learning, which focuses on understanding how to effectively work with the tools and technologies being used to craft the solution for your stakeholders.

What may not be so obvious is the move away from promoting specialization among your staff and instead fostering a move toward people with more robust skills, something called being a generalizing specialist.
Progressive organizations aggressively promote learning opportunities for their people outside their specific areas of
specialty as well as opportunities to actually apply these new skills.

HYBRID PROCESS

DAD will take elements from the other methodologies to tailor a process that best suites an enterprise agile team:

prioritized backlog from Scrum
Kanban dashboard and limit work in progress approach from Kanban (Toyota production system)
Agile way to manage data a and documents
CI/CD, TDD, collective ownership practices from Extreme Programming and DevOps

IT SOLUTIONS OVER SOFTWARE

As IT professionals we do far more than just develop software. Yes, software is clearly important, but in addressing the needs of our stakeholders we often provide new or upgraded hardware, change the business/operational processes that stakeholders follow, and even help change the organizational structure in which our stakeholders work.

Agile was created mostly by developers and consultants, we need to focus more on business needs and company processes optimizations.

Goal-Driven Delivery Lifecycle

It is a delivery process extending the Scrum one, starting from the initial vision to the release in production;
explicit phases: Inception, Construction and Transition;
- Inception: initiate team, schedule stakeholders meetings, requirements collection, architecture design, align with company policies, release planning, set up environment
- Construction: CI, CD, burndown charts, TDD, refactoring, retrospective, etc..
- Transition: delivering in production. This stage contains steps like UAT, data migration, support environment preparation, stakeholders alignment, finalize documentation and solution deployment.
put the phases in the right context: evaluate system preparation activities before development start and management of the system by other groups after the final release
explicit milestones

Conclusions

Here we have seen, shortly, the main differences from Scrum, Lean and Disciplined Agile Delivery.

DAD is a very complex process and to find out the details there is just THE book to read in the final references.

A complete enterprise delivery process is something that requires months of work by an architecture board, but the point here is how to take the right direction as soon as possible, avoiding being hypnotized by buzz-words like Scrum or thinking that we are really agile just because we do a hour stand-up meeting every morning.

Start from removing your technical depth following firmly EP and DevOps practices. Then start formalizing your process methodology and make sure every one is walking on the same path.

REFERENCES:

Disciplined Agile Delivery: A Practitioner’s Guide to Agile Software Delivery in the Enterprise by Scott W. Ambler, Mark Lines: https://www.goodreads.com/book/show/13661387-disciplined-agile-delivery
Flaccid Scrum, Martin Fowler: https://martinfowler.com/bliki/FlaccidScrum.html
How to calculate ROI: https://online.hbs.edu/blog/post/how-to-calculate-roi-for-a-project
Extreme programming: https://en.wikipedia.org/wiki/Extreme_programming
Agile fluency model: https://www.agilefluency.org/model.php

Supervised learning regression analysis on Google stocks

By affinitoalessandro

On 2016-04-05

In algoritmi, data mining, programmazione, python

Supervised learning on Google stock analysis and predictions

Abstract

We study some tech stock price through data visualization and some financial technique, focusing on those which are intended to give a sort of reliable prevision to permit brokers have a basis on which they could decide when it is the best moment to sell or buy stocks. We first analyze a year of data about the biggest companies as Amazon, Google, Apple and Microsoft but right after that we focus on Google stocks.

Next we leave the financial tools for supervised learning analysis. These machine learning processes learn a function from an input type to an output type using data comprising examples. Furthermore we’ll talk specifically of regression supervised learning, meaning that we’re interested in inferring a real valued function whose values corresponds to the mean of a dependant variable (stock prices).

We first applied linear regression on the last 6 years of Google Trends about the word ‘google’ specifically searched in the financial news domain, versus the last 6 years Google stock prices. From now on we change our feature domain with a multivariate input, i.e. we use other stock prices (AAPL, MSFT, TWTR, AMZN) to study the accuracy of others algorithms such as a multivariate linear regression, a SVR and a Random Forest.

keywords : Finance, Stock Price Analysis, MACD, Machine Learning, Linear Regression, SVR, Random Forest, Data Visualization, Python, R

Have a look at the work through nbviewer (it fails to load images from folders, to do it download the project and open it in the jupyter notebook)
go to GitHub

What to do next ?

Do you see any error? Please tell me what to correct and why;
Implement these algorithms on other stocks and compare results
Add the r sqared to the RMSE comparison
Try to predict future stocks prices instead of contemporary ones

Amazon, Apple, Microsoft and Google pairplot

Controllo automatico della connessione rimanente, per rete 3 (tre.it)

By affinitoalessandro

On 2015-10-22

In programmazione, script

Script per il controllo automatico (dalle 8:00 alle 23:00 ogni 30 min) della connessione rimanente con l’abbonamento 3. In caso il valore sia inferiore ad uno preimpostato (500MB) invia un email d’allerta. E’ necessario essere connessi con la 3.

L’unica versione funzionante è la selenium, che necessita Firefox.

Ma con qualche piccola modifica sono sicuro che riusciate ad utilizzare Chrome se preferite, o a reindirizzarlo sul sito del vostro provider.

Scarica il codice da github!

Se è effettivamente utile fatemelo sapere che si può migliorare facilmente. 🙂

Data mining – 2014 homeworks solutions

By affinitoalessandro

On 2015-03-09

In data mining, programmazione, script

Homeworks solutions (pdf + code).

Algorithm Design / Theoretical Computer Science – 2015 – Homeworks solutions

By affinitoalessandro

On 2015-01-23

In programmazione, python, script

Since often in the homeworks the questions are similar to those of the previous years, here there are my solutions.
It costed us weeks of work, too much for dying forgotten in my hard disk.

Site course

Homework 1: Solution with python scripts
Themes : Stable Matching, Greedy Algorithms, Dynamic Programming, Choosing a random value.

Homework 2 : Solution
Themes : Set Cover, partial set cover, max cover, linear programmming (LP), integer linear programming (ILP), maximum weight matching, game theory, approximation, steiner tree, minimum spanning tree.

For the Latex version of the solutions please donate and I will send it to you with joy. 🙂

Stop forgetting seen episodes

By affinitoalessandro

On 2014-08-06

In css, script, utility

I don’t have good memory.. so often I click ( and start watching ) on tv-series episodes already seen.. and it can take a while before I realize it..

That’s how I resolved :

On Chrome ( on Firefox it’s very similar)

open the page with the links
open the Developer Tools (Ctrl+Maiusc+I on Chrome) or right-click on the body of the page and select Inspect Element
click on the + on the right ( new style rule) and add these 2 blocks:

a{
background-color: white;
}

a:visited {
background-color: black;
}

If the background isn’t white change the colors as you wish. But remember that it isn’t a permanent solution, it works as long as you keep the window open!

That’s all 😉