TuxErrante

Libertà vo cercando, ch'è sì cara, come sa chi per lei vita rifiuta — "I go seeking liberty, which is so dear, as he knows who refuses life for it." (Dante, Purgatorio I)

Utilizing Secure Containers for Malware Analysis and Syscall Monitoring

1. Introduction

The persistent and increasingly sophisticated nature of malware poses a significant threat across the digital landscape.

Understanding the behavior of malicious software is paramount for developing effective security strategies and mitigation techniques.

Traditional methods of malware analysis, often involving virtual machines or isolated sandbox environments, can present challenges in terms of resource intensity and the potential, albeit minimized, for escape or host compromise.

Secure container technologies have emerged as a promising alternative, offering enhanced isolation capabilities that can facilitate safer and more efficient malware analysis. These technologies, including gVisor, Kata Containers, and ZeroVM, employ distinct architectural approaches to create robust execution environments for potentially harmful code.

This report will delve into the capabilities of these secure container technologies, specifically in the context of malware analysis and the monitoring of system calls. The primary objectives are to compare their suitability for this use case, recommend the most appropriate option based on the need for comprehensive syscall monitoring and robust security, and provide practical guidance on setting up a secure analysis environment using Dockerfiles and essential security measures.

This includes preventing privilege escalation, network access, and volume mounts, while also exploring methods to identify the endpoints and folders the malware attempts to interact with during its execution.

2. Deep Dive into Secure Container Technologies

2.1. gVisor

gVisor is presented as a platform designed to enhance container security by providing an open-source, Linux-compatible sandbox that integrates seamlessly with existing container tooling. Its architecture centers around the Sentry, a user-space kernel that intercepts and manages system calls made by the containerized application, and the Gofer, a separate user-space process responsible for handling interactions with the host filesystem.

This design aims to isolate the host operating system from the container, enabling the secure execution of untrusted code and adding a critical layer of defense against container escapes and privilege escalation vulnerabilities. The Sentry is implemented in Go, a memory-safe language, which contributes to the overall security posture. This layered approach, where the Sentry intercepts application system calls and gVisor itself is sandboxed from the host using Linux’s isolation features, significantly complicates any attempts by malware to compromise the underlying system. Furthermore, gVisor explicitly enumerates and controls the host system’s interface exposed to the Sentry, restricting the Sentry from performing many sensitive host operations, such as opening new files or creating new sockets. This significantly reduces the potential attack surface even if the Sentry were to be compromised.

When a containerized application within gVisor makes a system call, the Sentry intercepts it before it reaches the host kernel. The Sentry then determines whether to permit, deny, or emulate the call within its user space environment.

This filtering and emulation process is vital in preventing malicious or unsafe system calls from directly interacting with the host kernel, thereby establishing a secure sandbox for the container. A fundamental design principle of gVisor is that no system call is passed directly to the host; each supported call has an independent implementation within the Sentry.

This independent implementation aims to prevent identical vulnerabilities present in the host kernel from being exploitable through the gVisor sandbox, creating a distinct security boundary. While gVisor offers robust kernel isolation and a defense-in-depth architecture that makes host escape challenging, along with runtime monitoring capabilities that can integrate with threat detection tools, it does have limitations for malware analysis.

Notably, it can introduce performance overhead, especially for input/output (I/O) intensive workloads due to the Gofer process and system call emulation. Additionally, gVisor does not implement the entire Linux system call surface, which might affect the execution and analysis of some malware samples. The performance overhead associated with gVisor, particularly in file I/O and system call latency, could potentially alter the behavior and analysis of malware that heavily relies on these operations. Researchers must be aware of this potential skewing of observed behavior when using gVisor for malware analysis.
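As a quick sanity check of a gVisor setup, the runsc runtime can be registered with Docker and a throwaway container started under it. This is a sketch that assumes runsc is already installed on the host; the commands follow the upstream gVisor documentation and need Docker on the machine.

```shell
# Register runsc with Docker (writes /etc/docker/daemon.json) and restart the daemon
sudo runsc install
sudo systemctl restart docker

# Run a throwaway container under gVisor; the kernel version reported here
# comes from the Sentry's emulated kernel, not from the host kernel
docker run --rm --runtime=runsc alpine uname -a
```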

2.2. Kata Containers

Kata Containers adopts a different strategy by running each container within a lightweight Virtual Machine (VM).

This is achieved through hardware virtualization technologies such as Intel VT-x or AMD SVM, establishing a strong isolation boundary at the hypervisor level. Each Kata Container effectively boots its own minimal guest operating system kernel, separate from the host kernel, providing enhanced workload isolation and security.

The Kata runtime acts as the intermediary between the container ecosystem (e.g., Docker, Kubernetes) and the underlying hypervisor, managing the lifecycle of these lightweight VMs. The use of a dedicated guest kernel for each container offers a significant security advantage against kernel-level exploits that might target the host kernel in traditional container environments, drastically reducing the attack surface from the perspective of the containerized malware.

Kata Containers’ adherence to industry standards like the Open Container Initiative (OCI) container format and the Kubernetes Container Runtime Interface (CRI) ensures seamless integration with existing container management tools and workflows, making it easy to adopt for malware analysis without significant infrastructure modifications.

By leveraging hardware virtualization, Kata Containers provides strong isolation of network, I/O, and memory for each container. Even if malware manages to gain elevated privileges within its container, it remains confined within the boundaries of the lightweight VM and cannot directly access or interfere with the host system or other containers on the same host. This hardware-enforced isolation offers a more robust security boundary than the software-based mechanisms found in traditional containers and even gVisor, making Kata Containers particularly appealing for analyzing high-risk or sophisticated malware.

Kata Containers’ strengths for malware analysis therefore include robust workload isolation, enhanced security due to virtualization, and broad compatibility with container ecosystems and tools: standard container images run with minimal modifications. Its limitations are a higher resource overhead compared to standard containers and gVisor, since each container necessitates its own VM and kernel, and some performance overhead for network-intensive workloads due to the virtualization layer. The higher resource overhead can also reduce the density of analysis environments that can be run on a single host, so researchers need to consider the available resources when planning their malware analysis infrastructure.
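One concrete way to observe Kata's per-container guest kernel, assuming Kata Containers is installed and registered with Docker under the runtime name kata-runtime (the name varies by version and installation method):

```shell
# Host kernel version
uname -r

# Kernel version inside a Kata container: this is the guest VM's kernel,
# so it will normally differ from the host's
docker run --rm --runtime=kata-runtime alpine uname -r
```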

2.3. ZeroVM

ZeroVM employs a distinct architecture based on Software Fault Isolation (SFI) technology derived from the Google Native Client (NaCl) project. It establishes a secure and isolated execution environment, termed a “cell,” capable of running a single thread or application. Unlike conventional virtualization approaches, ZeroVM isolates individual processes without providing a complete operating system or kernel. Its primary emphasis is on secure computation and minimizing data movement by enabling data-local computing.

Security in ZeroVM is enforced through static binary validation and a highly restricted system call API. Static binary validation occurs before execution, allowing code containing unsafe instructions to be detected and rejected without incurring runtime monitoring overhead; it also implies, however, that the malware must be in a format the ZeroVM validator can process. The system call API comprises only six calls (pread, pwrite, jail, unjail, fork, exit), drastically reducing the attack surface available to malware, but any sample relying on other, unsupported system calls will simply fail to run. The stark contrast between ZeroVM’s six system calls and the hundreds available in a standard Linux environment highlights its extreme focus on security through restriction: many common malware behaviors depend on a much broader set of system calls for file system operations, networking, and process manipulation.

ZeroVM’s strengths for malware analysis lie in its very strong isolation due to SFI and the extremely limited system call API, its lightweight nature with fast startup times, and its embeddability. However, it has significant limitations for general malware analysis: the highly restrictive system call API, the requirement for applications to be cross-compiled to NaCl using a specific toolchain, and limited language support (primarily C/C++, with some support for older versions of Python and Lua). The cross-compilation requirement makes ZeroVM impractical for analyzing pre-compiled malware binaries obtained in the wild, since researchers would need the malware’s source code, which is rarely available. Furthermore, the project appears to have significantly reduced activity in recent years, raising concerns that future security vulnerabilities might not be addressed promptly.

3. Comparative Analysis for Malware Testing

3.1. Evaluation of Isolation Capabilities

All three technologies provide enhanced isolation compared to standard containers. gVisor achieves strong isolation by running a user-space kernel that intercepts and emulates system calls, creating a significant barrier against host kernel exploitation. Kata Containers offers the most robust isolation by running each container within a lightweight VM with its own dedicated kernel, leveraging hardware virtualization for a strong security boundary. ZeroVM provides strong isolation through Software Fault Isolation and a highly restrictive system call API, effectively limiting the actions malware can undertake. Kata Containers’ utilization of hardware virtualization generally offers the strongest and most comprehensive isolation, making it the most resilient against potential escape attempts by sophisticated malware. Hardware-level separation provides a more fundamental security boundary than software-based techniques alone, and the dedicated kernel in each Kata Container further minimizes the risk of host kernel vulnerabilities being exploited.

3.2. Assessment of Performance Impact

The performance characteristics of each technology must be considered in relation to the type of malware being analyzed. gVisor can introduce performance overhead, particularly for I/O-intensive and syscall-heavy workloads, which might affect the real-time behavior of some malware. Kata Containers has a higher initial resource footprint due to the VMs, but once the VM is running, performance for many operations can be near-native, although network I/O might experience some overhead. ZeroVM is designed to be very lightweight with fast startup times, but its limited syscall API might prevent accurate performance analysis of malware relying on unsupported calls. For malware that is sensitive to timing or relies heavily on I/O, gVisor’s overhead might be a concern. Kata Containers offers a good balance of performance and strong isolation for a wider range of malware. ZeroVM’s limitations make it less suitable for comprehensive performance analysis. The goal is to observe the malware’s behavior as accurately as possible, and significant performance differences introduced by the isolation technology could skew the analysis results.

3.3. Ease of Monitoring Syscalls and System Activity

Kata Containers and gVisor both provide good options for monitoring system calls. gVisor integrates well with runtime security tools like Falco for monitoring application behavior and system calls within the sandbox; standard Linux tools might also be usable, although the output will reflect the Sentry’s syscall handling. Kata Containers allows the use of standard Linux monitoring tools like strace and sysdig within the guest VM to observe the malware’s system call activity. ZeroVM’s highly limited syscall API simplifies monitoring to the few available calls, but traditional Linux syscall monitoring tools are not directly applicable due to its SFI nature. Overall, Kata Containers offers the most straightforward path: researchers can use familiar, well-established tools like strace and sysdig directly inside the isolated guest VM, which simplifies the analysis process and reduces the learning curve associated with the isolation technology.

3.4. Integration with Syscall Monitoring Tools

gVisor should be compatible with strace and potentially sysdig since it aims for Linux API compatibility, though the output might reflect the Sentry’s syscall handling. Within the guest VM of Kata Containers, standard Linux tools like strace and sysdig can be used unmodified to monitor the malware’s syscall activity. Due to its non-standard environment and limited syscall API, strace and sysdig as used on typical Linux systems will not function within ZeroVM. Again, Kata Containers offers the most direct and seamless integration with standard Linux syscall monitoring tools, giving researchers familiar and powerful instruments for detailed malware analysis.

Table 1: Comparison of Secure Container Technologies for Malware Analysis

| Feature | gVisor | Kata Containers | ZeroVM |
|---|---|---|---|
| Isolation Strength | Strong | Very Strong | Strong |
| Performance | Potential Overhead | Moderate Overhead | Very Low Overhead |
| Syscall Monitoring | Good | Excellent | Limited |
| Resource Usage | Low | Moderate to High | Very Low |
| Compatibility | High | High | Low |
| Community Support | Active | Active | Less Active |
| Suitability for Malware Analysis | Good | Excellent | Poor |

4. Selecting the Optimal Container Technology for Malware Analysis

Based on the comparative analysis, Kata Containers emerges as the most suitable technology for safely testing malware and monitoring all system calls. Its robust isolation provided by lightweight virtual machines, coupled with its compatibility with standard container images and the ability to use familiar Linux tools like strace and sysdig within the guest VM, offers the best balance of security, flexibility, and analytical power. Kata Containers’ ability to provide “the speed of containers, the security of VMs” makes it an ideal choice for malware analysis, offering a robust and isolated environment without sacrificing the ability to perform detailed inspection using standard Linux tools. The use of a dedicated guest kernel per container provides a significant security advantage, particularly for malware that might attempt to exploit kernel-level vulnerabilities. While gVisor provides good isolation and integrates with cloud-native security tools, its potential performance overhead and the fact that it doesn’t implement the full Linux syscall surface might hinder the analysis of certain types of malware. ZeroVM’s extreme restrictions and lack of recent activity make it impractical for general malware analysis. For malware analysis requiring comprehensive syscall monitoring with tools like strace and sysdig, Kata Containers provides a more directly compatible and robust environment.

```mermaid
graph LR
    subgraph HostOS
        HostKernel((Kernel))
        Hardware[Hardware]
    end

    subgraph KataContainers
        direction LR
        subgraph KataVM[Light VM]
            KataKernel((Kernel))
            KataContainer[Container]
        end
        Hypervisor[Hypervisor] -- uses --> Hardware
        HostKernel -- manages --> Hypervisor
        Hypervisor -- executes --> KataVM
        KataVM -- contains --> KataContainer
    end

    subgraph ZeroVM
        direction LR
        ZeroContainer[Container]
        ZeroLib[Isolated libraries]
        HostKernel -- executes --> ZeroContainer
        ZeroContainer -- uses --> ZeroLib
        ZeroLib -- interacts --> HostKernel
    end

    subgraph gVisor
        direction LR
        gVisorSentry[User-space kernel Sentry]
        gVisorContainer[Container]
        HostKernel -- executes --> gVisorSentry
        gVisorSentry -- simulates syscall --> HostKernel
        gVisorSentry -- contains --> gVisorContainer
    end

    style HostKernel fill:#f9f,stroke:#333,stroke-width:2px
    style Hardware fill:#ccf,stroke:#333,stroke-width:2px
    style Hypervisor fill:#aaf,stroke:#333,stroke-width:2px
    style KataKernel fill:#9cf,stroke:#333,stroke-width:2px
    style KataContainers fill:#cff,stroke:#333,stroke-width:2px
    style KataVM fill:#eee,stroke:#333,stroke-width:2px,stroke-dasharray: 5 5
    style ZeroContainer fill:#efe,stroke:#333,stroke-width:2px
    style ZeroLib fill:#aee,stroke:#333,stroke-width:2px
    style gVisorSentry fill:#faa,stroke:#333,stroke-width:2px
    style gVisorContainer fill:#fca,stroke:#333,stroke-width:2px

    HostOS -- contains --> KataContainers
    HostOS -- contains --> ZeroVM
    HostOS -- contains --> gVisor
```

5. Building a Secure Malware Analysis Environment

5.1. Host System Security

Ensuring the host operating system’s security is the foundational step in creating a safe malware analysis environment. The host should be hardened and kept up to date with the latest security patches to minimize vulnerabilities that malware could potentially exploit to escape the container. Minimizing the software installed on the host reduces the attack surface, limiting the number of potential entry points for malicious actors. Implementing strong access controls is crucial, including the use of strong, unique passwords and multi-factor authentication to prevent unauthorized access to the host system. Enabling a host-based firewall to restrict network access to only essential services further enhances security by limiting potential communication channels. While running the container runtime in rootless mode can add an extra layer of security by limiting the impact of a potential container escape, its compatibility with Kata Containers, which relies on virtualization, might need careful consideration. Hardening the host system provides the initial and critical layer of defense for the malware analysis environment. Even with robust container isolation, a vulnerable host could still be compromised, potentially allowing an attacker to interact with or observe the analysis environment, regardless of the container’s security.
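A minimal host-hardening sketch along the lines described above, assuming a Debian/Ubuntu host with ufw available; the management network range 192.0.2.0/24 is a placeholder from the documentation address space:

```shell
# Apply pending security patches
sudo apt-get update && sudo apt-get -y upgrade

# Host firewall: deny all inbound traffic by default, allow outbound,
# and permit SSH only from a trusted management range (placeholder)
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow from 192.0.2.0/24 to any port 22 proto tcp
sudo ufw enable
```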

5.2. Kata Containers Configuration

To configure Kata Containers for secure malware analysis, begin by installing the runtime following the official installation guide for your operating system. Utilizing a package manager-based installation is recommended for ease of management and updates. Consider configuring Kata Containers to use a lightweight and security-focused hypervisor like Firecracker, if your environment supports it. Firecracker is specifically designed for microVMs and prioritizes security and low overhead, making it well-suited for isolated execution environments. Next, review the configuration.toml file, typically located in /etc/kata-containers/, and adjust security-related settings according to your specific needs. Pay close attention to options concerning network configuration, storage drivers, and hypervisor parameters. For malware analysis, it is advisable to explicitly disable default network interfaces within the Kata VM if you intend to manage networking at the Docker or containerd level to ensure complete isolation. The configuration.toml file also offers options for enabling debugging, which can be valuable during the initial setup and troubleshooting phases of your analysis environment. Carefully configuring the Kata Containers runtime, particularly the selection of the hypervisor and the settings within configuration.toml, is essential for optimizing both the security and performance of the malware analysis environment. The default settings might not be ideally suited for a highly secure malware analysis setup, so reviewing and adjusting these settings allows for fine-tuning the environment to meet specific security requirements, such as minimizing the hypervisor’s attack surface or precisely controlling resource allocation.
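To review the effective runtime settings, Kata ships an environment-dump subcommand; exact command names and the configuration path vary between Kata releases, so treat the following as a sketch:

```shell
# Dump the resolved Kata configuration (hypervisor path, guest kernel, image, ...)
kata-runtime kata-env

# Inspect hypervisor- and kernel-related keys in the configuration file
grep -E '^(path|kernel|image|machine_type)' /etc/kata-containers/configuration.toml
```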


6. Dockerfile Examples for Malware Analysis

The following Dockerfile examples provide a foundation for building secure malware analysis containers using Kata Containers. These examples emphasize using minimal base images, installing necessary monitoring tools, and running the malware as a non-privileged user.

6.1. Alpine-based Dockerfile

```Dockerfile
FROM alpine:latest

# Install necessary tools for analysis
RUN apk add --no-cache strace sysdig

# Add a non-root user for running the malware
RUN adduser -D malware_user

# Copy the malware binary into the container
COPY malware /home/malware_user/malware

# Set file permissions (make executable)
RUN chmod +x /home/malware_user/malware

# Switch to the non-root user
USER malware_user

# Set the entry point to run the malware with strace
ENTRYPOINT ["strace", "-f", "/home/malware_user/malware"]
```

This Dockerfile uses the minimal Alpine Linux image. It installs strace and sysdig for system call monitoring. A non-root user malware_user is created to run the malware, adhering to the principle of least privilege. The malware binary is copied into the container, made executable, and the container’s entry point is set to run the malware under strace for immediate syscall capture.
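Building and running this image under the Kata runtime might look as follows; the image and container names are placeholders, and the hardening flags anticipate the measures detailed in section 7:

```shell
# Build the analysis image from the Dockerfile above
docker build -t malware-analysis .

# Run it under Kata with no network, no extra capabilities,
# and no volume mounts from the host
docker run --rm \
  --runtime=kata-runtime \
  --network none \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --name analysis-run \
  malware-analysis
```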

6.2. UBI9-mini-based Dockerfile

```Dockerfile
FROM registry.access.redhat.com/ubi9-minimal:latest

# Install necessary tools for analysis
# (note: the sysdig package is not in the default UBI repositories;
# enable the upstream sysdig package repository first for this to succeed)
RUN microdnf install -y strace sysdig

# Add a non-root user for running the malware
RUN useradd -m malware_user

# Copy the malware binary into the container
COPY malware /home/malware_user/malware

# Set file permissions (make executable)
RUN chmod +x /home/malware_user/malware

# Switch to the non-root user
USER malware_user

# Start sysdig filtered on the malware process name
# (note: sysdig needs its kernel driver and elevated privileges, so in
# practice it is usually run from the host instead; see section 8.2)
ENTRYPOINT ["sysdig", "proc.name=malware"]
```

This example utilizes the Red Hat Universal Base Image 9 Minimal (UBI9-mini). It installs strace and sysdig using microdnf. Similar to the Alpine example, a non-root user malware_user is created and the malware binary is copied and made executable. Note that the entry point starts sysdig filtered on the malware process name rather than launching the sample itself; the malware must then be started separately (for example with docker exec), or sysdig can be run from the host as described in section 8.2.

6.3. Sysdig/Sysdig-based Dockerfile

```Dockerfile
FROM sysdig/sysdig:latest

# Add a non-root user for running the malware
RUN adduser -D malware_user

# Copy the malware binary into the container
COPY malware /home/malware_user/malware

# Set file permissions (make executable)
RUN chmod +x /home/malware_user/malware

# Switch to the non-root user
USER malware_user

# Set the entry point to run the malware
ENTRYPOINT ["/home/malware_user/malware"]
```

This Dockerfile leverages the sysdig/sysdig base image, which already includes the sysdig tool. It adds a non-root user malware_user, copies the malware binary, makes it executable, and sets the entry point to run the malware directly. You would then run this container with the Kata runtime and use sysdig on the host to monitor the container’s syscalls using filters like container.name=<container-name>.
These Dockerfile examples provide a starting point for creating secure malware analysis containers. They emphasize the principles of using minimal base images, installing necessary monitoring tools, and running the malware as a non-privileged user. The Sysdig-based example offers a convenient way to integrate syscall monitoring directly into the container execution environment or to monitor from the host.

7. Implementing Security Measures within the Container

7.1. Preventing Privilege Escalation

To prevent privilege escalation within the malware analysis container, several key security measures should be implemented.

Firstly, it is crucial to run the malware as a non-root user. This limits the potential damage the malware can inflict, as many system-level operations require root privileges. The Dockerfile examples in the previous section all incorporate this best practice by creating a dedicated non-root user for executing the malware.

Secondly, unnecessary Linux capabilities should be dropped. Linux capabilities provide fine-grained control over privileged operations, and removing those not strictly required by the malware (or the analysis tools) significantly reduces the attack surface. This can be achieved using the --cap-drop flag when running the container.

Finally, seccomp profiles restrict the system calls the malware can make, adding another critical layer of security. Seccomp (secure computing mode) filters system calls, permitting only those necessary for the application’s function. Kata Containers can enforce seccomp within the guest VM via the configuration.toml file, ensuring that the profile defined for the container is applied inside the isolated VM and limiting the malware’s potential for malicious activity even if it gains some level of privilege. This defense-in-depth approach layers multiple mechanisms: seccomp complements running as a non-root user and dropping capabilities by providing fine-grained control over system call usage.
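Combined, these measures translate into runtime flags roughly like the following; seccomp-profile.json is a hypothetical profile you must author for your analysis tooling, and the runtime name depends on your Kata installation:

```shell
# Run as the image's non-root user, with all capabilities dropped,
# privilege escalation blocked, and a custom seccomp allowlist applied
docker run --rm \
  --runtime=kata-runtime \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --security-opt seccomp=seccomp-profile.json \
  malware-analysis
```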

7.2. Disabling Networking

Disabling networking for the malware analysis container is essential to prevent the malware from communicating externally, which could involve command-and-control activities, downloading further malicious payloads, or attempting to spread the infection.

When using Docker with the Kata runtime, networking can be explicitly disabled with the --network none flag when running the container, ensuring the container has no network interfaces configured within its network namespace. Kata Containers inherently provides a level of network isolation, as it does not support the host network type directly; explicitly disabling networking at the Docker level adds a further layer of assurance. To verify that networking is indeed disabled, tools like ip addr or ifconfig can be run within the container to confirm the absence of any interface beyond loopback. Combined, Kata Containers’ default network isolation and the explicit --network none setting create a highly controlled environment in which the malware cannot communicate externally.
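A sketch of starting the container without networking and verifying the isolation from inside; image and container names are placeholders:

```shell
# Start the analysis container with no network interfaces in its namespace
docker run -d --rm --runtime=kata-runtime --network none \
  --name analysis-run malware-analysis

# Verify: only the loopback interface should be listed
docker exec analysis-run ip addr
```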

7.3. Restricting Mounted Volumes

Restricting mounted volumes is crucial to prevent the malware from accessing the host filesystem, which could lead to data exfiltration, modification of host files, or further compromise of the analysis environment. When running the malware analysis container with Kata Containers, avoid mounting any volumes from the host system. Kata Containers by default uses a paravirtualized filesystem sharing mechanism (virtio-9p, with a move towards virtio-fs) to share container images from the host with the guest VM; to prevent any host filesystem access beyond the initial image, ensure that no volumes are explicitly mounted using the -v flag in Docker or similar mechanisms in other container management tools. If the analysis requires any form of persistent storage or the ability to read and write files, consider using temporary filesystems within the container itself: these exist only for the lifespan of the container and are isolated from the host. Even though the container image itself is shared, the goal is to prevent the malware from reading or writing any other part of the host filesystem during execution, and this is achieved simply by not defining any volume mounts for the container instance.
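If the sample needs scratch space, an in-memory tmpfs avoids touching host storage entirely; a hedged example, with the size and mount options chosen arbitrarily:

```shell
# Read-only root filesystem, no -v mounts; writable space only via a tmpfs,
# which lives in memory and disappears with the container
docker run --rm \
  --runtime=kata-runtime \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  malware-analysis
```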

8. Monitoring Syscalls and Identifying Accessed Resources

8.1. Using strace

The strace utility is a powerful tool for capturing the system calls made by a running process. To use strace within a Kata Container, it can be included in the container image (as shown in the Alpine Dockerfile example).
When the container is run, the malware can be executed under strace using the entry point: ENTRYPOINT ["strace", "-f", "/home/malware_user/malware"]. The -f flag instructs strace to follow child processes, which is crucial as malware often spawns multiple processes. The output from strace will list each system call made by the malware, along with its arguments and return values. Analyzing this output can reveal the malware’s attempts to interact with the file system (e.g., open, read, write, unlink), network (e.g., socket, connect, sendto, recvfrom), and other system resources.
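When the full trace is too noisy, strace's syscall classes can narrow the capture to the behaviors of interest; for example, writing to a log file, with ./malware standing in for the sample:

```shell
# Trace only file- and network-related syscalls across all child processes
# and write the result to a log for later analysis
strace -f -e trace=file,network -o /tmp/malware-trace.log ./malware
```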

8.2. Using sysdig

sysdig is another excellent tool for monitoring system activity in Linux containers. It provides a more structured and filterable output compared to strace. sysdig can be installed within the container (as shown in the UBI9-mini Dockerfile example) or run on the host to monitor the Kata Container. To capture all events generated by a process named malware within the container, the following command can be used as the entry point in the Dockerfile: ENTRYPOINT ["sysdig", "proc.name=malware"].
Alternatively, if running sysdig on the host, you can filter by the container name: sudo sysdig container.name=<container-name>. sysdig allows for powerful filtering based on various criteria, including process name, event type, arguments, and more, which makes it easier to focus on specific activities of interest. These filters can be used to look specifically for syscalls related to network connections (even if they fail because networking is disabled) and for file and directory access attempts. Sysdig Inspect can then be used for more in-depth post-capture analysis.
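For host-side monitoring, sysdig can write a capture file and replay it with filters; <container-name> is a placeholder, and sysdig requires root plus its kernel driver on the host:

```shell
# Record all events from the analysis container to a capture file
sudo sysdig -w capture.scap container.name=<container-name>

# Replay the capture, filtering for connection attempts and file opens
sudo sysdig -r capture.scap evt.type=connect
sudo sysdig -r capture.scap "evt.type in (open, openat)"
```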

8.3. Analyzing Syscall Output 

Analyzing the potentially large output from strace or sysdig requires careful examination. Look for syscalls related to file operations (e.g., open, stat, mkdir, chmod), which can indicate the folders the malware is trying to reach or modify. Network-related syscalls (e.g., socket, connect, bind, sendto, recvfrom) can reveal the endpoints the malware is attempting to contact, even if the network is disabled and the connections fail. Pay attention to the arguments of these syscalls, as they often contain the specific file paths or IP addresses and port numbers being targeted. Tools like grep can be invaluable for filtering the output to focus on specific syscalls or patterns of interest.

8.4. Identifying Accessed Endpoints and Folders

To specifically identify the endpoints and folders the process in the container is trying to reach, focus on the network-related and file-operation syscalls in the strace or sysdig output.
For network activity, the `connect` syscall is particularly informative as its arguments typically include the destination IP address and port number. Even if the connection fails due to networking being disabled, the attempt will still be logged.
Similarly, for folder access, the `open`, `stat`, `mkdir`, and related syscalls will contain the file paths the malware is interacting with. By filtering the syscall output for these specific calls and examining their arguments, a clear picture of the malware's intended targets can be obtained. Even if network traffic is blocked, the malware will still issue syscalls attempting to connect to specific IP addresses or domain names, and file access attempts, even to non-existent paths, can reveal its intended actions. sysdig's rich filtering language lets researchers target exactly these events, providing valuable intelligence about the malware's targets even in the absence of successful network connections.
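To sketch what such filtering can look like in practice, here is a minimal, hypothetical Python parser for strace-style output. The regular expressions assume strace's usual textual format (which varies between versions), and the sample log lines are invented:

```python
import re

# Extract attempted endpoints and file paths from strace -f output.
CONNECT_RE = re.compile(r'connect\(\d+, \{.*?sin_port=htons\((\d+)\), '
                        r'sin_addr=inet_addr\("([\d.]+)"\)')
OPEN_RE = re.compile(r'open(?:at)?\(.*?"([^"]+)"')

def extract_targets(lines):
    """Return (set of (ip, port) tuples, set of file paths) seen in the log."""
    endpoints, paths = set(), set()
    for line in lines:
        m = CONNECT_RE.search(line)
        if m:
            endpoints.add((m.group(2), int(m.group(1))))
        m = OPEN_RE.search(line)
        if m:
            paths.add(m.group(1))
    return endpoints, paths

# Invented sample lines in strace's usual format:
sample = [
    'connect(3, {sa_family=AF_INET, sin_port=htons(443), '
    'sin_addr=inet_addr("203.0.113.7")}, 16) = -1 ENETUNREACH',
    'openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 4',
]
endpoints, paths = extract_targets(sample)
print(endpoints)  # {('203.0.113.7', 443)}
print(paths)      # {'/etc/passwd'}
```

Even failed connections (note the ENETUNREACH return value above) still expose the IP and port the malware was trying to reach.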

9. Conclusion and Recommendations

In conclusion, secure container technologies offer a significant advancement in the ability to safely analyze malware. Among the options explored, Kata Containers stands out as the most suitable choice for this task due to the robust isolation provided by its lightweight virtual machines, its compatibility with standard container images, and its seamless integration with familiar Linux system call monitoring tools like strace and sysdig. While gVisor offers a strong security model, its potential performance overhead and incomplete syscall coverage might limit its effectiveness for certain types of malware analysis.

ZeroVM, with its extreme restrictions and limited ecosystem, is not well-suited for general-purpose malware analysis.

For safely testing malware in secure containers, the following best practices should be adhered to:

  • Harden the host system: Ensure the host OS is up-to-date, has minimal software installed, and is protected by strong access controls and a firewall.
  • Use Kata Containers with a lightweight hypervisor: Configure Kata Containers to utilize a security-focused hypervisor like Firecracker for enhanced isolation.
  • Build minimal container images: Use minimal base images and install only the necessary tools for analysis.
  • Run as a non-root user: Always execute the malware under a non-privileged user account within the container.
  • Drop unnecessary capabilities: Remove any Linux capabilities that are not strictly required for the malware or analysis tools to function.
  • Utilize seccomp profiles: Implement seccomp profiles to restrict the system calls the malware can make.
  • Disable networking: Isolate the container from any network connectivity to prevent external communication.
  • Restrict volume mounts: Avoid mounting any host volumes into the container to prevent access to the host filesystem.
  • Monitor syscalls: Employ tools like strace or sysdig to capture and analyze the system calls made by the malware.
  • Analyze syscall output: Carefully examine the syscall output to identify file and network activity, revealing the endpoints and folders the malware attempts to reach.
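Many of these measures can be combined in a single invocation. The sketch below assumes Kata Containers is installed and registered with Docker under the runtime name kata, that a seccomp profile exists at ./seccomp-analysis.json, and that the image is named malware-analysis-image (all hypothetical names to be adapted to your setup):

```shell
# Hardened run: Kata isolation, no network, no capabilities,
# no privilege escalation, custom seccomp, read-only rootfs, non-root user.
docker run --rm \
  --runtime=kata \
  --network=none \
  --cap-drop=ALL \
  --security-opt no-new-privileges \
  --security-opt seccomp=./seccomp-analysis.json \
  --read-only \
  --user 1000:1000 \
  malware-analysis-image
```

Note that no -v flags are passed, so no host volumes are mounted into the container.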

For further research and security enhancements, consider exploring the integration of additional sandboxing tools within the Kata Container’s guest VM for an extra layer of analysis and control. Investigating the use of memory analysis tools within the isolated VM could provide deeper insights into the malware’s behavior.

Developing automated scripts or tools to parse and analyze the potentially large output from strace and sysdig would greatly enhance the efficiency of the analysis process.
Staying updated on the latest security best practices and configuration options for Kata Containers and the chosen hypervisor is also crucial for maintaining a secure and effective malware analysis environment.

Finally, contributing to open-source malware analysis projects or sharing findings with the security community can help to strengthen collective defenses against evolving threats.

Understanding the Realities of VPN Security

Lately, there’s a trend for everyone (from YouTube to Twitch to whoever…) to buy the latest, cheapest VPN deal and then claim that it’s now safe.

This is why I liked this video very much, even if it is a little misleading at the beginning.

I say misleading because at a certain point it seems to imply that VPNs are useless and insecure by design, but I don't think that was the message Occupy The Web intended to give.


🔒 Understanding the Realities of VPN Security 🔒

Several critical insights were shared about the limitations of VPNs for consumer security. While VPNs can hide your IP address from public Wi-Fi networks and your service provider, they are not a comprehensive solution for online privacy. Here’s why:

1️⃣ Browser Fingerprinting & Cookies: Even with a VPN, your online activities can be tracked through browser fingerprinting and cookies. For example:

  • Browser Fingerprinting: Websites can collect data such as your operating system, browser version, screen resolution, installed fonts, and even your time zone. By combining these data points, they create a unique identifier that can track you across different websites and browsing sessions[1][2].
  • Cookies: Cookies can store information about your browsing habits, login details, and preferences. Even if your IP address is hidden by a VPN, cookies can still be used to identify and track your online behavior[2].
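To illustrate the fingerprinting principle, here is a minimal sketch of how a handful of attributes can be combined into a stable identifier. The attribute values are invented, and real fingerprinting scripts collect far more signals, but the mechanism is the same:

```python
import hashlib
import json

# Hypothetical attribute set a site might collect via JavaScript;
# even a few such data points are often enough to single out one browser.
attributes = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64; rv:135.0)",
    "screen": "2560x1440",
    "timezone": "Europe/Rome",
    "fonts_installed": 312,
}

# A stable hash of the combined attributes acts as the tracking identifier:
# it stays the same across sessions and sites, even behind a VPN.
fingerprint = hashlib.sha256(
    json.dumps(attributes, sort_keys=True).encode()
).hexdigest()
print(fingerprint[:16])
```

Because none of these attributes depend on your IP address, hiding the IP with a VPN does not change the fingerprint.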

2️⃣ Critical Vulnerabilities: VPN products are often affected by critical CVEs (Common Vulnerabilities and Exposures) that are not patched quickly enough. A notable example is the Fortinet case, where significant vulnerabilities were left unaddressed for too long (CVE Feed).

Limited Use Cases

These are the scenarios where VPNs shine, and the scope within which they can provide the highest level of privacy:

  • ISP Tracking: VPNs can prevent your ISP from tracking your internet activity, but this protection is limited to your ISP only.
  • Public Wi-Fi: VPNs can prevent snooping by third parties on open Wi-Fi networks, but this is only effective against local threats on that network and its ISP.
  • Workplace Security: VPNs can securely connect you to your workplace, ensuring that no one can see your activity between you and your employer’s network.
A barely readable AI image to represent split tunneling 🙂

Split Tunneling: The purpose of split tunneling is not to enhance online security but to allow access to certain apps, websites, or services at normal internet speeds while connected to a VPN.
It also enables access to VPN-unfriendly apps, such as online banking, by routing some traffic through the VPN and some outside it on the same device. So be aware of your current VPN settings: know which traffic goes through the tunnel and which does not.

Real Attack Scenarios:

Of course, Fortinet is not the only victim of VPN-targeted attacks.

  • Brute Force Attacks: A recent campaign leveraged 2.8 million IP addresses to target VPN and firewall logins, attempting to breach credentials and gain unauthorized access[3][4].
  • Botnet Orchestration: Attackers use botnets to distribute login attempts across numerous IP addresses, making it difficult to block malicious traffic without affecting legitimate users[4].
  • Exploiting Unpatched Vulnerabilities: Unpatched VPN devices can be exploited to gain access to sensitive data or to use the compromised device as a proxy for further attacks[3].

While VPNs have their uses, it’s crucial to understand their limitations and use them as part of a broader security strategy.

Stay informed and stay safe! 🛡️


References

[1] 19 Browser Fingerprints That You Should Know | AdsPower

[2] Browser Fingerprinting: What It Is and How to Block It

[3] Massive Brute Force Attack Targets VPN & Firewall Logins Using 2.8 …

[4] 2025 Rising Threat: Sophisticated Brute Force Attacks Targeting VPN

The Art of Hiring in the Age of AI: A Manager’s Survival Guide

Navigating the modern hiring landscape has indeed become a complex dance, especially with the rise of AI tools that candidates use to enhance their applications (check the extra resources at the end). It’s no longer just about screening resumes; it’s about discerning genuine talent in a sea of algorithmically polished presentations.

Here are some thoughts on this topic to illustrate the challenges and, more importantly, the strategies involved.

The Evolving Battlefield: AI and the Hiring Process

Let’s be realistic. Candidates are using AI. And basically that’s OK.

They’re using it to refine their resumes, tailor their cover letters, and even practice interview questions. Some might even use it during the interview itself (more on that later).
While some see this as a threat, I argue that it’s an opportunity. It forces us, as managers or senior employees, to become better interviewers, to focus on the core skills and qualities that truly matter. The old ways of hiring, where keywords and surface-level knowledge were sufficient, are no longer enough. We need to dig deeper.

Countering the “AI Advantage”: It’s Not About Detection, It’s About Discernment

Many worry about “detecting” AI-generated content.

While tools like GPTZero can be helpful, they shouldn’t be your primary focus. Think about it: even if you know a candidate used AI, does it automatically disqualify them? Perhaps they used it to overcome a language barrier or to present their skills more effectively. Our goal isn’t to police AI usage; it’s to find the best person for the job.

The strongest counter-argument to embracing this reality is the fear of being deceived. “How can we trust anything they say?” This is a valid concern, but it misses the point. We’ve always had to deal with candidates exaggerating their skills. AI just raises the stakes. The solution isn’t to try and prevent AI use, but to design our hiring process in a way that reveals true capabilities, regardless of how the candidate presents themselves initially.

Beyond the Resume: A Multi-Layered Approach

  1. The Initial Screen (Human-Powered, AI-Aware):
    The HR screen remains important, but the focus shifts. Instead of just checking boxes, HR should be trained to look for narrative flow and consistency. AI can create a perfect-looking resume, but it often struggles with the nuances of a real career journey.
    HR should ask open-ended questions about career transitions, specific projects, and challenges overcome. Look for genuine passion and articulation, not just keyword recitation.
    • Implement a “fuzzy filtering” approach where 70-80% of candidates are selected through traditional merit-based screening.
    • Randomly select an additional 20-30% from the remaining pool to introduce unpredictability and reduce systemic biases.
    • Create screening questions that require lateral thinking and cannot be easily anticipated by AI.
    • Use anonymized initial screenings to reduce unconscious bias.
    • Clearly communicate the screening process to candidates.
  2. The Technical Gauntlet:
    Traditional coding tests are easily gamed, especially with AI assistance. Instead, focus on collaborative problem-solving.
    Have the candidate work on a realistic problem with a member of your team. Observe their thought process, their communication skills, and their ability to adapt to feedback. Don’t just look at the final solution; look at how they arrived there. Ask them to explain their choices, their reasoning, and their understanding of the trade-offs involved.
    • Develop a scoring matrix that weights factors beyond technical keywords.
      An example is given below.
    • Create “trap questions” that reveal AI-generated content by introducing subtle technical inconsistencies
    • Implement real-time problem-solving challenges that can’t be pre-prepared
  3. The Behavioral Deep Dive:
    This is where you assess the candidate’s soft skills, their cultural fit, and their potential for growth.
    Instead of generic behavioral questions, use scenario-based inquiries. Present them with a realistic workplace situation and ask them how they would handle it.
    For example: “Imagine a critical bug is discovered in production just before a major release. How do you approach the situation?” This will reveal their problem-solving skills, their communication style, and their ability to work under pressure.
  4. The “Explain It Like I’m Five” Test:
    Ask the candidate to explain a complex technical concept in simple terms. This is a great way to assess their true understanding of the subject matter. If they can’t explain it clearly and concisely, they probably don’t understand it as well as they claim.

  • Technical Skills (30%): Proficiency in relevant technical skills and tools.
  • Problem-Solving Ability (20%): Ability to approach and solve complex problems effectively.
  • Communication Skills (15%): Clarity, conciseness, and effectiveness in verbal and written communication.
  • Cultural Fit (10%): Alignment with company values, mission, and team dynamics.
  • Adaptability (10%): Ability to adapt to new challenges, technologies, and environments.
  • Emotional Intelligence (10%): Ability to understand and manage one’s own emotions and those of others.
  • Learning and Growth Mindset (10%): Demonstrated willingness and ability to learn and grow professionally.
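As a sketch of how the matrix could be applied, assuming per-factor ratings on a 0-10 scale (the candidate values below are invented; since the listed weights sum to 105%, the score is normalized by the total weight):

```python
# Weights taken from the scoring matrix above.
weights = {
    "technical_skills": 0.30,
    "problem_solving": 0.20,
    "communication": 0.15,
    "cultural_fit": 0.10,
    "adaptability": 0.10,
    "emotional_intelligence": 0.10,
    "learning_mindset": 0.10,
}

def weighted_score(ratings, weights):
    """Weighted average of per-factor ratings, normalized by total weight."""
    total_weight = sum(weights.values())
    return sum(ratings[k] * w for k, w in weights.items()) / total_weight

# Hypothetical candidate, rated 0-10 on each factor:
candidate = {
    "technical_skills": 8, "problem_solving": 7, "communication": 9,
    "cultural_fit": 6, "adaptability": 8, "emotional_intelligence": 7,
    "learning_mindset": 9,
}
print(round(weighted_score(candidate, weights), 2))  # prints 7.76
```

A score like this should inform, not replace, the qualitative signals from the interviews themselves.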

Embracing the Inevitable: AI as a Tool, Not a Threat

Instead of fearing AI, we should embrace it as a tool that can help us identify the best candidates. Think of it this way: if a candidate can use AI effectively to enhance their application, it demonstrates a certain level of technical savvy and resourcefulness. These are valuable qualities in a software engineer.

Ask them directly whether they have used AI during the process and how, or whether they are aware of the related security and ethical issues. You could even ask about the most successful prompt engineering techniques they’ve found; it will be a learning opportunity for you too 😉

The key is to design our hiring process in a way that goes beyond the surface level. We need to focus on the human element: critical thinking, problem-solving, communication, collaboration, and a genuine passion for technology. These are the qualities that AI can’t replicate (still), and they’re the qualities that will ultimately determine a candidate’s success.

Trust Your (educated & randomized) Gut

In the end, hiring is still a human endeavor. Don’t be afraid to trust your gut. If something feels too good to be true, it probably is. If a candidate’s answers sound too polished and rehearsed, dig deeper. Ask follow-up questions, challenge their assumptions, and see how they respond. The best candidates will be able to think on their feet, articulate their ideas clearly, and demonstrate a genuine passion for the work. Admitting ignorance on less critical topics is also a good signal. These are the qualities that matter, regardless of how candidates present themselves initially.

Extra resources:

Game Theory in Engineering Management

Playing the Infinite Game

Ever wondered why some engineering teams consistently deliver amazing results while others struggle with endless firefighting?
I think the secret often lies not in individual brilliance, but in understanding the games we’re playing. Let’s see if it could make sense trying to apply game theory in engineering management.

game theory engineering management

The Management Puzzle: What Are They Really Trying to Solve?

Let’s be honest – managing engineers sometimes feels like herding very intelligent cats who speak in algorithms. The core challenges typically boil down to:

  1. How do we align individual incentives with team goals?
  2. How do we optimize for long-term success rather than short-term wins?
  3. How do we create systems that naturally promote collaboration?

If you’re nodding along, congratulations – you’ve just identified the classic elements of what game theorists call a “multiplayer dynamic game with incomplete information.”
Sounds fancy, right? Don’t worry, we’ll break it down with some real-world examples.

Dynamic, Simultaneous, Multiplayer Games with Limited Information

Let’s break this mouthful down:

  • Dynamic: The game changes as you play it (like your codebase)
  • Simultaneous: Players make moves without knowing others’ choices (like parallel development)
  • Multiplayer: Multiple stakeholders involved (developers, users, business)
  • Limited Information: Nobody has the full picture (just like real software projects!)

This is why simplistic “if-then” management strategies often fail. We’re not playing chess (perfect information) or poker (zero-sum); we’re playing something more like a massive multiplayer online game where the rules themselves evolve.

The Zero-Sum Trap

Here’s a common scenario: Your team needs to ship a critical feature. Alice is brilliant at backend work, while Bob excels at frontend. Traditional management might focus on individual metrics – lines of code, tickets closed, etc. But that’s playing a zero-sum game where Alice and Bob compete for recognition rather than collaborate for success.

Instead, let’s apply game theory principles to create a positive-sum game:

Practical Applications

1. Design Better Incentive Systems

Let’s explore some concrete system-level incentives that create positive-sum games:

The Documentation Multiplier

# Components (example values):
story_points_completed = 40      # story points delivered in sprint (0-100)
documentation_coverage = 0.8     # fraction of code with clear documentation (0.0-1.0)
knowledge_sharing_index = 3 / 5  # knowledge sharing activities / team size
individual_output = 10           # tasks completed per sprint

sprint_success = story_points_completed * documentation_coverage      # 40 * 0.8 = 32.0
team_performance = individual_output * (1 + knowledge_sharing_index)  # 10 * (1 + 3/5) = 16.0

The Quality-Speed Balance

# Components (example values, both normalized to 1.0 for fair comparison):
speed = 0.9    # tasks completed per week / team capacity (0.0-1.0)
quality = 0.5  # (100 - bugs_found) / 100 (0.0-1.0)

# Traditional approach measures them separately and averages
performance_traditional = (speed + quality) / 2  # (0.9 + 0.5) / 2 = 0.7

# System thinking combines them multiplicatively,
# penalizing quality issues more heavily
performance_system = speed * quality             # 0.9 * 0.5 = 0.45

The Learning Network Effect

# Components (example values for a team of 5):
team_size = 5
base_productivity = 30             # standard team velocity (story points/sprint)
cross_trained_pairs = 6            # engineer pairs who can cover each other's work
total_possible_pairs = team_size * (team_size - 1) / 2  # n*(n-1)/2 = 10
technical_output = 5               # completed features per sprint
unique_viewpoints_contributed = 5  # members actively contributing to technical discussions

team_capability = base_productivity * (1 + cross_trained_pairs / total_possible_pairs)
# 30 * (1 + 6/10) = 48.0 points

innovation_score = technical_output * unique_viewpoints_contributed
# 5 * 5 = 25

This multiplicative approach to metrics creates several beneficial effects:

  1. Zero in any component drastically reduces overall performance
  2. Improvements in any area benefit the entire system
  3. Maximum performance requires balanced optimization across all factors

These formulas can be adjusted based on team size and organizational priorities, but the key is maintaining the multiplicative relationship between components. This ensures that no single metric can be optimized at the expense of others without harming overall performance.
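The difference between additive and multiplicative aggregation can be seen in a minimal sketch with invented, normalized component values:

```python
def additive(components):
    """Traditional averaging: weak areas are hidden by strong ones."""
    return sum(components) / len(components)

def multiplicative(components):
    """System thinking: any neglected component drags the whole score down."""
    product = 1.0
    for c in components:
        product *= c
    return product

balanced = [0.8, 0.8, 0.8]
lopsided = [1.0, 1.0, 0.0]  # one component completely neglected

print(round(additive(balanced), 3), round(multiplicative(balanced), 3))  # 0.8 0.512
print(round(additive(lopsided), 3), round(multiplicative(lopsided), 3))  # 0.667 0.0
```

The lopsided team still looks respectable under averaging, but the multiplicative score exposes that one area was abandoned entirely.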

2. Understanding Reputation Formation

In game theory, reputation isn’t just about being “good” or “bad” – it’s a complex dynamic system that evolves through repeated interactions. Think of it as your team’s internal cryptocurrency: every interaction either mines or burns tokens of trust.

Consider how senior engineers build reputation: they don’t just write good code; they create environments where others can write good code. This is what game theorists call “reputation externalities” – your actions don’t just affect your reputation, but influence how others build theirs.

The Trust Equation

Research in game theory suggests that trust can be modeled as:

# Components:
# credibility: track record of making good on promises (0-1)
# reliability: consistency of behavior over time (0-1)
# intimacy: psychological safety created in the team (0-1)
# self_orientation: focus on self vs team (-1 to 1)

def trust(credibility, reliability, intimacy, self_orientation):
    return (credibility + reliability + intimacy) / (1 + self_orientation)

# Example:
print(trust(0.9, 0.8, 0.7, 0.2))  # high-performing manager: ≈ 2.0
print(trust(0.9, 0.8, 0.7, 0.8))  # self-centered manager: ≈ 1.33

Good managers understand that reputation isn’t built through grand gestures but through consistent behavior in small interactions. Here’s how to play this game effectively:

  1. Information Symmetry
  • Share context behind decisions
  • Admit when you don’t know something
  • Be transparent about constraints and trade-offs
  2. Repeated Games
  3. Signal Amplification
  • Public recognition of team contributions
  • Taking responsibility for failures
  • Defending team interests in organizational politics

Remember: in game theory, the most successful strategies in repeated games tend to be “nice, forgiving, and non-envious.” This translates to management as:

  • Nice: Start with trust and support
  • Forgiving: Allow for mistakes and learning
  • Non-envious: Celebrate team success over personal recognition

3. The Game Within The Game

Here’s a secret about game theory: there’s always a bigger game. That bug fix you’re working on? It’s part of a feature. That feature? Part of a product. That product? Part of a company’s vision to change the world.

Understanding this “nested games” concept is crucial for motivation. When engineers see how their daily work connects to larger outcomes, they’re more likely to make decisions that optimize for the bigger game. It’s the difference between laying bricks and building a cathedral.


But Wait… What About Reality?

Now, I hear you thinking: “This all sounds great in theory, but what about…”

The Counter-Arguments

  1. “Individual brilliance matters!” Yes, but brilliant individuals in toxic systems produce suboptimal results. Think of it like a Formula 1 car with square wheels.
  2. “We need to move fast!” Short-term thinking is exactly what game theory helps us avoid. Remember, every game theorist’s favorite game is the infinite game.
  3. “Our industry is naturally competitive!” Perfect! That means understanding game theory gives you a competitive advantage.

Bringing It All Together

The key to successful engineering management isn’t just understanding technology or people – it’s understanding the games we create and play every day. By applying game theory principles, we can build systems that:

  • Naturally align individual and team incentives
  • Promote long-term thinking
  • Turn potential competitors into collaborators

Want to Dive Deeper?

  1. “The Evolution of Trust” – Nicky Case’s interactive guide (2020)
  2. Trust beyond reason – drmaciver.com
  3. MIT OpenCourseWare: Game Theory
  4. “Winning at Game Theory with The Price is Right” – John Farrier (2023)
  5. Strategy and Game Theory for Management by IIMA

Review: The Almanack of Naval Ravikant

“The Almanack of Naval Ravikant” is a treasure trove of wisdom from one of the most influential entrepreneurs and thinkers of our time.

The book, curated by Eric Jorgenson, is a compilation of Naval’s thoughts on wealth, happiness, and life, distilled from his tweets, podcasts, and interviews. As someone who has taken extensive notes on this book, I am eager to share a deep, thoughtful review that highlights its key themes and my personal reflections.

Definitely my best read this year. 📚

Introduction to Naval’s Philosophy

Naval Ravikant, co-founder of AngelList (among many other things), is known for his incisive thinking on startups, investing, and personal growth. “The Almanack” presents his philosophy in a structured manner, covering a wide range of topics from wealth creation to achieving happiness. The book is divided into two main sections: “Wealth” and “Happiness,” each offering profound insights and actionable advice.

Wealth Creation: A Modern Approach

Naval’s approach to wealth creation is unconventional yet practical. He emphasizes the importance of building specific knowledge, leveraging technology, and owning equity in scalable businesses. According to Naval, wealth is not about renting out your time but rather creating assets that work for you even when you’re not actively involved.

  1. Specific Knowledge: Naval argues that developing unique skills that cannot be easily replicated is crucial for long-term success. This involves deep diving into areas you are passionate about and continuously honing your expertise.
  2. Leverage: Utilizing tools such as capital, code, and media to amplify your efforts is another cornerstone of Naval’s philosophy. He believes that with the right leverage, one can achieve outsized returns on investment.
  3. Ownership: Owning equity in businesses or other assets is vital for generating wealth. Naval stresses the importance of taking ownership roles rather than being an employee who trades time for money.

Happiness: Beyond the Pursuit of Wealth

Naval’s thoughts on happiness revolve around finding inner peace and contentment rather than pursuing external achievements. He advocates for mental clarity, mindfulness, and detachment from material possessions as pathways to lasting happiness.

  1. Inner Peace: Naval suggests that true happiness comes from within and is achieved by understanding and controlling one’s mind. Meditation, self-reflection, and gratitude are some practices he recommends.
  2. Mindfulness: Being present in the moment and fully engaging with the here and now is crucial for happiness. Naval emphasizes the importance of mindfulness in reducing stress and enhancing life satisfaction.
  3. Detachment: Letting go of material possessions and external validation is another key aspect. Naval argues that our constant pursuit of more can lead to dissatisfaction and that true happiness lies in appreciating what we already have.

“The Almanack of Naval Ravikant” offers valuable insights into both wealth creation and the pursuit of happiness. However, integrating these two aspects of life requires a nuanced approach that Naval’s book only partially addresses. Balancing high ambition with inner peace is a challenge that each individual must navigate based on their unique circumstances and values.

Personal Reflections and Additional Arguments

Balancing Wealth and Happiness

While the principles outlined in the wealth section are compelling, there is a nuanced balance to be struck between maximizing productivity and maintaining personal well-being. Naval’s emphasis on optimizing hourly rates to $5,000 or more could seem at odds with a peaceful, non-competitive life. Striking a balance between high productivity and personal fulfillment, particularly in family life, is critical but not thoroughly addressed.

Role Models of Happiness

From my point of view characters like Mary Poppins, Pippi Longstocking, and Leslie Knope illustrate that true happiness often involves an active engagement with life, a sense of purpose, and a positive influence on others. Mary Poppins embodies joy and wisdom while making the world better around her. Pippi Longstocking’s independence and zest for life show a child’s innocent happiness, and Leslie Knope from “Parks and Recreation” combines ambition with a genuine care for her community.
I struggle to see happiness in a completely peaceful person without some positive chaos in the mix.

The Role of Anger

Naval mentions that “anger is a precursor to violence” and advises against it. He promotes calm and rationality as the optimal emotional states for making decisions and interacting with others.

I do not fully agree with this vision.

Arun Gandhi, in “The Gift of Anger,” explores the notion that anger, when managed and harnessed properly, can be a powerful force for positive change. Here are some insights from his book:

  1. Channeling Anger: Arun Gandhi emphasizes the importance of recognizing anger as a natural emotion and learning to channel it constructively. He shares personal anecdotes of how his grandfather, Mahatma Gandhi, taught him to transform anger into positive action rather than letting it lead to destructive behaviors.
  2. Empathy and Understanding: Gandhi advocates for using anger to foster empathy and understanding. He believes that by understanding the root causes of anger, we can address underlying issues and promote healing and reconciliation.
  3. Constructive Action: The book highlights various examples of individuals who have used their anger to drive social change. From civil rights movements to personal growth, Gandhi illustrates how anger can be a catalyst for constructive action when guided by principles of non-violence and compassion.

Random thoughts

  1. Happiness is Multifaceted: True happiness encompasses more than just inner peace;
    it involves an active pursuit of one’s passions and purpose, moving through life with a sense of fulfillment without being overly concerned with potential or real issues. A person who is deeply broken can find peace momentarily, such as while meditating by a lake, and still be depressed.
  2. Controlled Anger as a Motivator: While uncontrolled anger can indeed lead to negative outcomes, controlled anger can also be a powerful motivator. When harnessed properly, anger can provide the energy and drive needed to overcome obstacles and push through difficult times.
  3. His obsession with studying and continuous learning hit me hard, since I totally agree with it, as long as you also get enough sleep and social life (which he doesn’t). My next studies will probably be about communication basics, microeconomics, and game theory.

Despite some inconsistencies, the book remains a highly influential guide for those seeking to enhance their financial well-being and personal fulfillment. Naval’s wisdom encourages readers to think deeply about their lives, make informed decisions, and pursue paths that lead to both success and contentment.


Resetting Your Mind

Post-Vacation Guide to Self-Improvement

-> Download the Google Doc template for daily journaling here!
[ENG] [ITA]


Introduction: Embarking on a Journey of Self-Discovery

As the golden hues of summer fade into the crisp embrace of autumn, we find ourselves at a pivotal juncture—a moment ripe for introspection and personal evolution. 🚀

This comprehensive guide invites you to embark on an odyssey of self-improvement, delving deep into the labyrinth of the human psyche.

We’ll navigate the treacherous waters of cognitive biases, scale the peaks of mindfulness, and unearth the hidden treasures of self-awareness.

By the end of this journey of self-discovery, you’ll be equipped with a cartographer’s precision in mapping out your path to a more fulfilling life.


I. The Power of Self-Reflection

A. The Illusion of Control

Imagine, if you will, a masterful puppeteer, deftly manipulating the strings of a marionette.
Now, picture yourself as both the puppeteer and the puppet: a paradox that has encapsulated the human condition ever since we began to reflect on ourselves.
We often fancy ourselves as the sole authors of our thoughts and actions, the puppeteers of our destiny. However, recent forays into the realm of cognitive science have revealed a more complex narrative.

Our minds, it seems, are less like well-oiled machines and more like eccentric artists, prone to flights of fancy and irrational brushstrokes. This tendency to deviate from the path of pure logic is what psychologists term “cognitive biases“—mental shortcuts that, while often useful, can lead us astray in a labyrinth of misperception.

Dual nature of the brain

Enter Daniel Kahneman, the Virgil to our Dante in this cognitive underworld. His seminal work, “Thinking, Fast and Slow,” illuminates the dual nature of our thought processes:

  1. System 1: The impulsive artist, flinging paint on the canvas of our consciousness with reckless abandon. Quick, intuitive, and emotionally charged, this system is the wellspring of our gut reactions and instinctive responses.
  2. System 2: The meticulous critic, scrutinizing every brushstroke with a discerning eye. Slow, deliberate, and analytical, this system is responsible for our more considered judgments and rational decision-making.

Understanding this cognitive duality is akin to gaining x-ray vision into the inner workings of our minds. It allows us to recognize when System 1 might be leading us down a primrose path of bias, and when it’s time to summon the more methodical System 2 to the fore.

However, let us not fall into the trap of oversimplification. Kahneman’s later work, “Noise,” introduces a new character to this cognitive drama—the concept of “noise” in decision-making. Picture a group of well-intentioned judges, all faced with the same case. Despite their expertise, their judgments may vary wildly due to factors as capricious as their mood or the weather. This variability, this “noise,” can lead to a cacophony of inconsistent and potentially unfair decisions.

Moreover, the scientific community, ever vigilant, has raised eyebrows at some of the experiments presented in “Thinking, Fast and Slow.” The specter of irreproducibility looms, casting shadows of doubt on the generalizability of certain findings. Yet, like a controversial masterpiece in an art gallery, Kahneman’s work continues to provoke thought and inspire further exploration of the human mind.

B. The Importance of Mindfulness

In the bustling marketplace of our minds, where thoughts jostle for attention and emotions cry out their wares, mindfulness emerges as a serene oasis. It is the practice of becoming a neutral observer to the carnival of our inner experience, paying attention to the present moment with the impartiality of a scientist and the wonder of a child.

Jon Kabat-Zinn, a modern-day alchemist in the realm of mental well-being, has distilled the ancient wisdom of mindfulness into a potent elixir for contemporary ailments. His mindfulness-based stress reduction (MBSR) techniques offer a beacon of hope in the stormy seas of modern life.

Imagine mindfulness as a skilled gardener, tenderly cultivating the soil of your consciousness. With patient attention, it can:

  1. Prune away the overgrown vines of stress and anxiety
  2. Nurture the delicate blossoms of emotional intelligence
  3. Fortify the roots of resilience against life’s tempests
  4. Create fertile ground for creativity and insight to flourish

By developing this inner garden, we create a sanctuary where we can retreat from the cacophony of automatic thoughts and knee-jerk reactions. Here, in this cultivated space of awareness, we can observe our cognitive biases with clarity and compassion, gently redirecting our mental energies towards more fruitful paths.

[Italian]

II. Practical Exercises for Self-Awareness

A. Journaling: The Cartography of the Soul

Picture yourself as an intrepid explorer, charting the vast and often mysterious terrain of your inner world. Your journal is your map, your compass, and your field notes all in one. With each entry, you’re not just recording events; you’re documenting the contours of your psyche, the climate of your emotions, and the flora and fauna of your thoughts.

Consider these journaling prompts as your expedition gear:

  1. “What unexpected discovery did I make about myself today?”
  2. “If my emotions were weather patterns, what’s the forecast for today, and why?” 🌦️
  3. “What cognitive bias might be influencing my current perspective on …?”
  4. “If I could have a conversation with my future self, what advice would they give me?”

As you traverse this inner landscape day by day, patterns will emerge like constellations in the night sky, guiding you towards deeper self-understanding.

B. Mindfulness Meditation

⏸️ As I mentioned in the previous chapter, in our “modern” life mindfulness meditation is akin to finding the pause button on reality.
It’s a practice that invites you to step off the treadmill of constant doing and into the realm of simply being.

Begin your meditation journey with the curiosity of a novice and the patience of a sage:

  1. Start small:
    Even five minutes of focused breathing can be a revolutionary act in a world that demands constant attention.
  2. Use guided resources:
    Apps like Headspace or Calm can be like having a meditation sherpa, guiding you through the initial foothills of practice.
  3. Embrace imperfection:
    Your mind will wander. That’s not failure; it’s part of the process. Each time you notice and gently return to your breath, you’re strengthening your mindfulness muscles.

Remember, the goal isn’t to achieve a blank mind—that’s as impossible as trying to empty the ocean. Instead, you’re learning to surf the waves of your thoughts rather than being tossed about by them.

C. Cognitive Behavioral Therapy

Imagine your mind as a vast computer network. CBT is like a sophisticated debugging program, helping you identify and rewrite faulty code in your mental software. It’s a collaborative process between you and a trained therapist, aimed at uncovering the hidden scripts that drive your thoughts, emotions, and behaviors.

Key CBT techniques include:

  1. Thought records: Documenting your automatic thoughts and examining the evidence for and against them.
  2. Behavioral experiments: Testing the validity of your beliefs through real-world actions.
  3. Cognitive restructuring: Learning to reframe negative thought patterns into more balanced, realistic perspectives.

While CBT can be particularly transformative for those grappling with anxiety or depression, its principles can benefit anyone seeking to optimize their mental processes. It’s like upgrading your internal operating system to run more smoothly and efficiently.

III. Overcoming Cognitive Biases

Try to fight your biases every day. Here are a few of the most common ones.

A. Confirmation Bias

Imagine you’re an art collector with a predilection for impressionist paintings.
You’ve just acquired what you believe to be a lost Monet. Naturally, you seek out experts who specialize in impressionism, read articles about Monet’s techniques, and surround yourself with other Monet enthusiasts. But what if, in your zeal, you’ve overlooked crucial evidence that your painting is actually a skilled forgery?

This is confirmation bias in action—our tendency to seek out information that confirms our existing beliefs while ignoring or discounting contradictory evidence. It’s like wearing rose-colored glasses that filter out any hues that don’t match our preconceptions.

To combat this bias:

  1. Play devil’s advocate with yourself. For every belief you hold, challenge yourself to find three pieces of credible evidence that contradict it.
  2. Engage in structured debates where you must argue for positions you disagree with. This exercise in intellectual empathy can broaden your perspective.
  3. Cultivate a diverse network of friends and colleagues who will challenge your views respectfully but firmly.

Remember, the goal isn’t to abandon your beliefs, but to hold them with an open hand rather than a clenched fist.

B. Availability Heuristic

Picture yourself as the director of your own mental news network. The availability heuristic is like a sensationalist news anchor, giving disproportionate airtime to stories that are vivid, recent, or emotionally charged, regardless of their actual frequency or importance.

For instance, after watching a documentary about shark attacks, you might overestimate the likelihood of being bitten by a shark, even though you’re statistically more likely to be injured by a vending machine.

To counteract this bias:

  1. Become a data detective. Before making judgments about likelihood or frequency, seek out hard data and statistics from reliable sources.
  2. Practice perspective-taking. Ask yourself, “If I were from a different background or lived in a different part of the world, how might my perception of this issue change?”
  3. Keep a “surprise journal” where you record events or information that contradict your expectations. This can help calibrate your intuitive sense of probability.

C. Anchoring Bias

Imagine you’re at an auction, and the first item up for bid is a rare book. The auctioneer starts the bidding at $1000. Suddenly, that number becomes a mental anchor, influencing how you value not just that book, but potentially every item that follows.

The anchoring bias is like a stubborn boat anchor, holding our judgments in place even when we should be drifting towards a more accurate assessment. It’s particularly insidious in negotiations, where the first number mentioned can disproportionately influence the final outcome.

To weigh anchor and sail towards more accurate judgments:

  1. Before entering any situation involving numerical estimates or negotiations, decide on your own values or ranges independently.
  2. Practice generating multiple reference points. If you’re estimating the cost of a project, for example, break it down into smaller components and estimate each separately before summing them up.
  3. Seek out diverse perspectives before making a decision. Each new viewpoint can serve as a potential alternative anchor, reducing the pull of any single reference point.

If this topic intrigues you, you’ll find a much more extensive list in my notes, and an even longer one on the FS blog.

IV. Building a Mindful Lifestyle

A. Incorporate Mindfulness into Daily Activities

Mindfulness need not be confined to the meditation cushion. In fact, the real magic happens when we infuse our daily activities with present-moment awareness. This is the alchemy of turning mundane tasks into opportunities for insight and growth.

Consider these mindful twists on everyday activities:

  1. Mindful Eating: Transform your meals into a sensory symphony. Notice the colors on your plate, inhale the aromas, savor each texture and flavor. Eating becomes not just fueling, but a celebration of the senses.
  2. Mindful Walking: Whether it’s a forest trail or a city sidewalk, walk as if you’re discovering the world for the first time. Feel the ground beneath your feet, the rhythm of your breath, the play of light and shadow around you.
  3. Mindful Listening: In conversations, practice giving your full attention to the speaker. Notice not just their words, but their tone, body language, and the emotions underlying their message. You might be surprised at how much more you hear when you’re truly listening.
  4. Mindful Creation: Whether you’re coding, cooking, or crafting, bring full awareness to the process. Notice the sensations in your body, the thoughts that arise, the subtle decisions you make at each step.

By sprinkling these moments of mindfulness throughout your day, you’re not just going through the motions of life—you’re fully inhabiting each moment.

B. Connect with Nature

In our increasingly digital world, reconnecting with nature is not just a luxury—it’s a necessity for mental and emotional wellbeing. Nature, in its infinite wisdom, has much to teach us about balance, resilience, and the art of simply being.

Consider these nature-based practices:

  1. Forest Bathing: This Japanese practice, known as “shinrin-yoku,” involves immersing yourself in the atmosphere of the forest. It’s not about hiking or exercising, but about opening your senses to the natural world around you.
  2. Earthing: Also known as grounding, this practice involves direct physical contact with the Earth’s surface. Walk barefoot on grass, sand, or soil, and feel the subtle energy exchange between your body and the earth.
  3. Sky Gazing: Lie on your back and watch the ever-changing canvas of the sky. Whether it’s the drama of storm clouds or the serenity of stars, sky gazing can shift your perspective and remind you of the vastness beyond your immediate concerns.
  4. Plant Tending: Nurturing a garden or even a single houseplant can be a profound practice in patience, care, and attunement to natural rhythms.

Research suggests that these nature connections can lower cortisol levels, boost creativity, and even enhance our capacity for empathy and cooperation. In the grand tapestry of life, we are not separate from nature—we are nature, and reconnecting with the wild can be a powerful way of coming home to ourselves.

C. Practice Gratitude: The Alchemy of Appreciation

Gratitude is like a pair of magical spectacles that, once donned, transform the mundane into the miraculous. It’s the art of recognizing the gifts in our lives, both grand and subtle, and allowing that recognition to shift our entire emotional landscape.

Here are some creative ways to start cultivating a gratitude practice:

  1. Gratitude Jar: Each day, write down one thing you’re grateful for on a small slip of paper and add it to a jar. On tough days, read through some of these notes to remind yourself of life’s blessings.
  2. Photographic Gratitude: Take a photo each day of something you’re grateful for. Over time, you’ll create a visual diary of appreciation that can be powerful to look back on.
  3. Gratitude Letters: Once a month, write a detailed letter of thanks to someone who has positively impacted your life. The act of writing deepens your appreciation, and sharing the letter can create a beautiful ripple effect of positivity.
  4. Gratitude Walks: As you walk, mentally note everything you’re grateful for that you encounter—the warmth of the sun, the smile of a stranger, the convenience of sidewalks. This practice combines the benefits of nature connection, mindfulness, and gratitude.
  5. “Three Good Things” Exercise: Each night before bed, reflect on three good things that happened during the day, no matter how small. This practice has been shown to increase happiness and decrease depressive symptoms.

Remember, gratitude isn’t about ignoring life’s challenges or forcing positivity. It’s about developing a more balanced perspective that acknowledges both the difficulties and the gifts in our lives.

Gratitude practice 2.0

But an effective gratitude practice goes beyond simply listing things to be grateful for; it involves rewiring the nervous system.

Selecting Your Story

Begin by identifying a story that resonates deeply with you. It could be a personal anecdote, a fictional tale, or a historical event. The key is that it evokes feelings of inspiration, compassion, or awe.

Creating Your Journal Entry

Once you’ve chosen your story, dedicate a page or two in your journal to explore it in detail. Consider the following prompts:

  • Express gratitude: Write about the aspects of the story that you are grateful for. What qualities or actions inspire gratitude in you?
  • Summarize the story: Briefly recount the main events and characters.
  • Identify the emotional impact: What feelings does the story evoke in you? Are there specific moments or characters that resonate particularly strongly?
  • Connect to your own experiences: How does this story relate to your own life? Are there any parallels or lessons that you can draw from it?

Conclusion: The Never-Ending Story of Growth

As we conclude this enhanced guide, remember that personal growth is not a destination but a journey—an ongoing narrative that you are constantly writing and rewriting. Like any good story, it will have its plot twists, its moments of triumph and despair, its cast of supporting characters, and its themes that evolve over time.

The practices and insights shared here are not a prescription for perfection, but rather a set of tools to help you navigate the complex terrain of your own psyche. As you implement these strategies, approach yourself with the curiosity of a scientist, the compassion of a good friend, and the patience of a wise teacher.

Remember, too, that growth often happens in the spaces between our deliberate efforts—in the quiet moments of reflection, in the unexpected challenges that push us beyond our comfort zones, and in the connections we forge with others on their own journeys.

As you move forward, carry with you the understanding that every experience, every mistake, every moment of clarity or confusion, is an opportunity for growth. Your life is a masterpiece in progress, and you are both the artist and the art.

So, as the season changes and you embark on this next chapter, do so with a heart full of curiosity, a mind open to new possibilities, and a spirit ready for adventure. The journey awaits.


#4 Sharing Friday

https://arstechnica.com/tech-policy/2024/04/google-agrees-to-delete-private-browsing-data-to-settle-incognito-mode-lawsuit/

Google has agreed to a settlement over a class-action lawsuit regarding Chrome’s “Incognito” mode, which involves deleting billions of data records of users’ private browsing activities.
The settlement includes maintaining a change to Incognito mode that blocks third-party cookies by default, enhancing privacy for users and reducing the data Google collects.


Profile-guided optimization – The Go Programming Language (golang.org)

Go: The Complete Guide to Profiling Your Code | HackerNoon

Have you already tried Go profiling with PGO?

  • More informed compiler optimizations lead to better application performance.
  • Profiles from already-optimized binaries can be used, allowing for an iterative lifecycle of continuous improvement.
  • Go PGO is designed to be robust to changes between the profiled and current versions of the application.
  • Storing profiles in the source repository simplifies the build process and ensures reproducible builds.
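As a minimal sketch of that iterative lifecycle (assuming a service that exposes net/http/pprof on localhost:8080; host, port and paths are placeholders for your setup), the PGO loop looks roughly like this:

```shell
# Grab a 30-second CPU profile from a running instance
# (assumes the app imports net/http/pprof).
curl -o cpu.pprof "http://localhost:8080/debug/pprof/profile?seconds=30"

# Since Go 1.21, `go build` with -pgo=auto (the default) picks up a file
# named default.pgo in the main package directory.
mv cpu.pprof default.pgo

# Rebuild: the compiler uses the profile to guide inlining and other
# optimizations, and the new binary can be profiled again next cycle.
go build -pgo=auto .
```

Committing default.pgo next to the main package is what keeps builds reproducible, as the last bullet notes.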

https://jvns.ca/blog/2024/02/16/popular-git-config-options/#commit-verbose-true

Here’s a list of git config options that could come in very handy!


https://www.srepath.com/clearing-observability-delusions/

Observability is highlighted as the fundamental practice for all other Site Reliability Engineering (SRE) areas, essential for avoiding “flying blind.”

The article discusses common misconceptions that hinder success in observability, emphasizing the need for the right mindset and avoidance of overly complex solutions.

The shift towards event-based Service Level Objectives (SLOs) is recommended over time-based metrics, advocating for simplicity and the importance of leadership support in SLO implementation.


https://blog.plerion.com/hacking-terraform-state-privilege-escalation/

The article discusses the security risks associated with Terraform state files in DevOps, particularly when an attacker gains the ability to edit them.

It highlights that while the Terraform state should be secure and only modifiable by the CI/CD pipeline, in reality, an attacker can exploit it to take over the entire infrastructure.
The piece emphasizes the importance of securing both the Terraform files and the state files, as well as implementing measures like state locking and permission configurations to prevent unauthorized access and modifications.
It also explores the potential for attackers to use custom providers to execute malicious code during the Terraform initialization process.


https://thehackernews.com/2024/03/microsoft-confirms-russian-hackers.html

The article details a cybersecurity breach where the Russian hacker group Midnight Blizzard accessed Microsoft’s source code and internal systems.

Microsoft confirmed the breach originated from a password spray attack on a non-production test account without multi-factor authentication.

The attack, which began in November 2023, led to the theft of undisclosed customer secrets communicated via email. Microsoft has contacted affected customers and increased security measures, but the full extent and impact of the breach remain under investigation. The incident highlights the global threat of sophisticated nation-state cyber attacks.

#3 Sharing Friday

https://blog.cloudflare.com/harnessing-office-chaos

This page provides an in-depth look at how Cloudflare harnesses physical chaos to bolster Internet security and explores the potential of public randomness and timelock encryption in applications.

It tells the story of Cloudflare’s LavaRand, a system that uses physical entropy sources like lava lamps for Internet security and has grown over four years, diversifying beyond its original single source.
Cloudflare handles millions of HTTP requests secured by TLS, which requires secure randomness.
LavaRand contributes true randomness to Cloudflare’s servers, enhancing the security of cryptographic protocols.


https://radar.cloudflare.com/security-and-attacks

Here you can find a very interesting public dashboard provided by Cloudflare, showing lots of stats about current cyber attacks.


avelino/awesome-go: A curated list of awesome Go frameworks, libraries and software (github.com)

A curated list of awesome Go frameworks, libraries and software


https://www.anthropic.com/news/claude-3-family

ChatGPT4 has been beaten.

Introducing three new AI models – Haiku, Sonnet, and Opus – with ascending capabilities for various applications.
Opus and Sonnet are now accessible via claude.ai and the Claude API, with Haiku coming soon.
Opus excels in benchmarks for AI systems.

All models feature improved analysis, forecasting, content creation, code generation, and multilingual conversation abilities.


kubectl trick of the week.

.bashrc

function k_get_images_digests {
  ENV="$1"
  APP="$2"
  # sort before uniq -c: uniq only collapses adjacent duplicates
  kubectl --context "${ENV}-aks" \
          -n "${ENV}-security" get pod \
          -l "app.kubernetes.io/instance=${APP}" \
          -o json | jq -r '.items[].status.containerStatuses[].imageID' | sort | uniq -c
}

alias k-get-images-id=k_get_images_digests

Through this alias (e.g. k-get-images-id prod myapp) you can get all the image digests of a specific release, filtering by its label, and then count the unique values.

#2 Sharing Friday

News

  • Found a new security bug in Apple M-series chipset
    The article discusses a new vulnerability in Apple’s M-series chips that allows attackers to extract secret encryption keys during cryptographic operations.
    The flaw is due to the design of the chips’ data memory-dependent prefetcher (DMP) and cannot be patched directly, potentially affecting performance.
  • Redis is changing its licensing
    Redis is adopting a dual licensing model for all future versions starting with Redis 7.4, using RSALv2 and SSPLv1 licenses, moving away from the BSD license.
    Future Redis releases will integrate advanced data types and processing engines from Redis Stack, making them freely available as part of the core Redis product.
    The new licenses restrict commercialization and managed service provision of Redis, aiming to protect Redis’ investments and its open source community.
    Redis will continue to support its community and enterprise customers, with no changes for existing Redis Enterprise customers and continued support for partner ecosystem.
  • Nobody wants to work with our best engineer
    The article discusses the challenges faced with an engineer who was technically skilled but difficult to work with.
    It highlights the importance of teamwork and collaboration in engineering, emphasizing that being right is less important than being effective and considerate.

Bash

Get your current branch quickly up to date with master with this alias (note the single quotes: they defer the command substitution until the alias actually runs, instead of baking in whatever branch was current when .bashrc was sourced):

alias git-update-branch='current_branch=$(git branch --show-current); git switch master && git pull --force && git switch "$current_branch" && git merge master'

Software Architecture

  • Chubby OSDI paper by Mike Burrows
    and here’s their presentation on this topic
    https://www.usenix.org/conference/srecon23emea/presentation/virji

  • Chubby is intended to provide coarse-grained locking and reliable storage for loosely-coupled distributed systems, prioritizing availability and reliability over high performance.

    It has been used to synchronize activities and agree on environmental information among clients, serving thousands concurrently.

    Similar to a distributed file system, it offers advisory locks and event notifications, aiding in tasks like leader election for services like the Google File System and Bigtable.

    The emphasis is on easy-to-understand semantics and moderate client availability, with less focus on throughput and storage capacity.

    Database Simplification: It mentions the simplification of the system through the creation of a simple database using write-ahead logging and snapshotting.
  • Introduction to Google Site Reliability Engineering slides by Salim Virji
    The presentation introduces key concepts related to SRE, emphasizing the importance of automating processes for reliability and efficiency.

    It also delves into the delicate balance between risk-taking and maintaining system stability.

    Throughout the slides, the material highlights teamwork, effective communication, and the impact of individual behavior within engineering teams. Overall, the session aims to equip students with practical insights for successful SRE practices while navigating the complexities of modern software systems.

#1 Sharing Friday

Kubernetes

  • To quickly check for all images in all #pods from a specific release (e.g. the Cassandra operator):
kubectl get pods -n prod-kssandra-application -l app.kubernetes.io/created-by=cass-operator -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s '[[:space:]]' '\n' |sort |uniq -c


Bash

  • To generate a strong random #password you don’t need suspicious online services, just plain old bash/WSL.
    This function reads from the /dev/urandom device, whose output is cryptographically secure; it then keeps only characters from an acceptable list and finally cuts a 16-character string.

    Keep it with you as an alias in your .bashrc maybe 🙂
function getNewPsw(){
  tr -dc 'A-Za-z0-9!"#$%&'\''()*+,-./:;<=>?@[\]^_`{|}~' </dev/urandom | head -c 16; echo
}

SAFe VS Platform Engineering

I know this is a very opinionated topic and "agile coaches" everywhere are ready to fight, so I'll try to keep it short, making clear that this is based purely on my own experience and on discussions with other engineers and managers at different companies and levels.

We’re a team of Scaled Agile SRE,
Working together to deliver quality,
Breaking down silos and communication gaps,
We’re on a mission to make sure nothing lacks.

We follow the SAFe framework to a tee,
With its ARTs and PI planning, we’re not so free,
To deliver value in every sprint,
Continuous delivery is our mint.

Chorus:
Scaled Agile and SRE,
Together we achieve,
Quality and speed,
We’re the dream team.

We prioritize work and plan ahead,
Collaborate and ensure nothing’s left unsaid,
We monitor, measure, and analyze,
Our systems to avoid any surprise.

Chorus

We take ownership and accountability,
To deliver value with reliability.

Chorus

So when you need to deliver at scale,
You know who to call and who won’t fail,
Scaled Agile SRE,
Together we’re the ultimate recipe.

ChatGPT4 & me

To avoid making this post too verbose, I’ll focus on two points that I find paramount in an SRE team living in a Scaled Agile Framework (SAFe) with a Kanban-style approach: capacity planning and value flow.

Capacity

What is your definition of capacity?

Most teams never ask themselves this simple question, and then struggle for months to plan better. Is capacity the sum of our hours per day? Or is it calculated from each person’s availability after removing the average amount of support, maintenance, security fixes and operations emergencies?

While learning to drive, in general but even more so on a motorcycle, you’re introduced to the paradoxical concept of “expect the unexpected!”

Of course, this won’t always save your life, but it can greatly reduce the probability of an accident, because you’ll stick to best practices refined over decades of driving: don’t overtake when you can’t see the exit of a turn, don’t tailgate the vehicle in front, and always consider the state of the road, your surroundings and your tires before speeding up.

The good part of computer science is that you have a lot of incidents!

But this becomes a value only if you start measuring them and then learning from them.

So we should treat our work less like artistic craftsmanship and more statistically, going back over closed user stories and extracting average completion times split by category (support, emergencies, toil elimination, research…).

Nobody complains!

You now have a rough estimate of how much time is spent on variable actions and maintenance; let’s say 20 hours per week.

You also know your fixed appointments: at least 20 minutes per day for the daily meeting, 1 hour per week to discuss issues coming from development teams, and 1 hour for infrastructure refinement (evaluating open tasks, innovations to adopt or share with the team…).

Let’s say you’ll be neither on support (answering dev teams’ questions and providing them new resources) nor on call (helping the operations team resolve emergencies).

This will give you around 40 – 20 – 1 (dailies) – 1 (weekly) – 1 (infra) – 1 (dev team weekly) – 0.5 (weekly with your manager) = 15.5 h/w of capacity, meaning 31h of capacity for the next iteration if it lasts two weeks.

Probably less, since you know you already have two other periodic useless meetings of one hour each, so let’s round down to 13 h/w ≈ 150 min/day of “uninterrupted” work.

Well… actually to not get crazy and start physically fighting my hardware I need a couple of breaks, let’s say 15 min in the morning and the same in the middle of the afternoon.

That means ≈ 120 min/day of “uninterrupted” work.
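As a back-of-the-envelope sketch, the arithmetic above can be captured in a few lines of Python. The figures are this post’s illustrative assumptions, not universal constants, and the script rounds 13.5 h down to 13, giving 126 min/day before the further rounding to 120 used above:

```python
# Rough weekly capacity estimate: start from contract hours, subtract
# measured variable work and fixed meetings, then convert what remains
# into focused minutes per day. All numbers are illustrative.

def weekly_capacity_hours(contract_h=40, variable_h=20, meetings_h=4.5):
    # 4.5 h/week = dailies + dev-team weekly + infra refinement
    # + issues weekly + half-hour 1:1 with the manager.
    return contract_h - variable_h - meetings_h

def focused_minutes_per_day(weekly_h, extra_meetings_h=2, breaks_min=30, workdays=5):
    # Subtract the two "useless" hour-long meetings, round down the
    # remaining hours, spread them over the week, remove daily breaks.
    usable_h = int(weekly_h - extra_meetings_h)
    return usable_h * 60 / workdays - breaks_min

cap = weekly_capacity_hours()             # 40 - 20 - 4.5 = 15.5 h/week
print(cap, focused_minutes_per_day(cap))  # 15.5 126.0
```

Encoding the estimate like this makes it trivial to re-run when the meeting load or the measured variable work changes.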

Fine, I assume I can take that high-priority user story we’ve estimated at 10h for the next iteration, plus a smaller one, leaving some contingency space.

We publish these results in the PI planning and to management, and nobody complains.

Long story short: if nobody ever complains, you’re probably not involving stakeholders correctly in your PI Planning, or worse, not involving them at all!

And that’s bad.

Why are you working on those features?

Why do those features exist in the first place?

If your team is decoupled from the business view, are you sure all this effort will help anything? Or do you smell re-work and failure?

We should also mention that this planning leaves no space for research and creative thinking. People will start solving issues quick and dirty, more and more.

Yeah, I could call Moss and Roy for a good pair programming session since they already solved this issue in the last iteration but… who wants another meeting? Let’s copy-paste this work around and move on for now…

How much value has my work?

To measure value, we need some kind of indicator.

There are plenty of articles on the pros and cons of setting metrics for a goal before even starting. Let’s just say you want a few custom indicators that have proven to be good estimators in past experience; they should take side effects into account and be some kind of aggregated result, meaning they shouldn’t be easily gamed (working only to improve the metric and not the quality).

Maybe we introduce general service availability and average service response time as two service level indicators (SLI).

Then management starts working on a Value Stream Analysis to understand where this value comes from, since it was requested as a new feature by the customers before the current agile train.

They succeed in reducing recurring meetings by 50% and increasing one-to-one communication. Now dev teams are able to solve issues by themselves thanks to better documentation, run-books, etc.

Conclusions

Imagine you’re trying to implement a complex application in Golang; after a while you’re still failing, so you decide to switch to Java Quarkus, which you don’t know, and mess around because you heard it’s easier. After a while, guess what? It still doesn’t work.

The same goes for Agile frameworks. People expect them to solve stuff auto-magically, but if we don’t put effort into changing our own behavior, into measuring ourselves in order to improve (and not to give our manager micromanagement power), using the latest agile methodology will never solve our Friday afternoon issues.

Implementing continuous SBOM analysis

  1. From-cves-scanners-to-sbom-generation
  2. You are here!
  3. Dependency Track – To come!

After the deep theoretical dive of the previous article, let’s translate all that jazz into real examples and practical use cases for implementing continuous SBOM file generation.

Verse 1)
Grype and Syft, two brothers, so true
In the world of tech, they’re both making their due
One’s all about security, keeping us safe
The other’s about privacy, a noble crusade

(Chorus)
Together they stand, with a mission in hand
To make the digital world a better place, you understand
Grype and Syft, two brothers, so bright
Working side by side, to make the world’s tech just right

(Verse 2)
Grype’s the strong one, he’s got all the might
He’ll protect your data, day and night
Syft’s got the brains, he’s always so smart
He’ll keep your secrets, close to your heart

(Chorus)

ChatGPT

[Azure pipelines] Grype + Syft

Below is a working example of a sample Azure pipeline comprising two templates: a vulnerability-scanning job and a parallel SBOM-generation job.

The first job leverages Grype, a well-known open-source project by Anchore, while the second uses its sibling, Syft.

First of all, we make this a continuous scan by selecting pushes on master as the trigger action, for example so it starts after each merge of a completed pull request.

You can specify the full name of the branch (for example, master) or a wildcard (for example, releases/*). See Wildcards for information on the wildcard syntax. For more complex triggers that use exclude or batch, check the full syntax on Microsoft documentation.
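As a minimal sketch, the trigger described above could look like this in the pipeline YAML (the branch names are just the examples from the text):

```yaml
trigger:
  branches:
    include:
      - master
      - releases/*
```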

In the Grype template we will

  • download the latest binary from the public project
  • set the needed permissions to read and execute the binary
  • check if there is a grype.yaml with some extra configurations
  • run the vulnerability scanner on the given image. The Grype database will be updated before each scan
  • save the results in a file “output_grype”
  • use output_grype to check whether there are alerts of severity High or above; if so, we also want a Warning raised in our Azure DevOps web interface.
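The last bullet, turning scan results into an Azure DevOps warning, can be sketched like this; the output_grype content below is a fabricated stand-in for real Grype output, but the ##vso logging command is the standard way to raise a Warning in the web interface:

```shell
# Fabricated sample scan results: package, version, CVE, severity
printf 'libfoo 1.0 CVE-2024-0001 Medium\nlibbar 2.1 CVE-2024-0002 High\n' > output_grype

# If any finding is High or Critical, emit an Azure DevOps logging command,
# which surfaces as a Warning in the pipeline run page
if grep -qE '(High|Critical)$' output_grype; then
  echo "##vso[task.logissue type=warning]Vulnerabilities of severity High or above were found"
fi
```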

The Syft template has a similar list of parameters, with the addition of the SBOM file format (json, text, cyclonedx-xml, cyclonedx-json, and more).
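A rough sketch of the scan-and-publish steps inside the Syft template (the parameter names imageName and sbomFormat are assumptions, not the pipeline’s actual ones):

```yaml
- script: |
    ./bin/syft ${{ parameters.imageName }} -o ${{ parameters.sbomFormat }} > sbom.json
  displayName: Generate SBOM with Syft

- publish: sbom.json
  artifact: sbom
  displayName: Publish SBOM artifact
```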

After scanning our image for all its components, we publish the artifact in our pipeline, since we’ll probably want to pull this list from an SBOM analysis tool (e.g. OWASP Dependency-Track, see the previous article).

Go to the code below (tested code).

Github Actions

In GitHub it’s even easier, since Syft is available as a ready-made Anchore action.

By default, this action will execute a Syft scan in the workspace directory and upload a workflow artifact SBOM in SPDX format. It will also detect whether it is being run during a GitHub release and, if so, upload the SBOM as a release asset.

A sample would be something like this:

name: Generate and Publish SBOM

on:
  push:
    branches:
      - main

env:
  DOCKER_IMAGE: <your-docker-image-name>
  ANCHORE_API_KEY: ${{ secrets.ANCHORE_API_KEY }}
  SBOM_ANALYSIS_TOOL_API_KEY: ${{ secrets.SBOM_ANALYSIS_TOOL_API_KEY }}

jobs:
  generate_sbom:
    runs-on: ubuntu-20.04

    steps:
    - name: Checkout code
      uses: actions/checkout@v2

    - name: Generate SBOM using Anchore SBOM Action
      uses: anchore/sbom-action@v0
      with:
        image: ${{ env.DOCKER_IMAGE }}
        output-file: anchore_sbom.json

    - name: Publish SBOM
      uses: actions/upload-artifact@v2
      with:
        name: sbom.json
        path: anchore_sbom.json

Code Samples

cve-sbom-azure-pipeline.yml



From CVEs scanners to SBOM generation

Example of Software Life Cycle and Bill of Materials Assembly Line

DevOps companies have always been in constant pursuit of making their software development process faster, more efficient, and more secure. In that quest for better software security, a shift is happening from traditional vulnerability scanners to Software Bill of Materials (SBOM) generation. This article explains why DevOps companies are making the switch and how SBOM generation provides better security for their software.

A CVE is known to all, it’s a security flaw call
It’s a number assigned, to an exposure we’ve spied
It helps track and prevent, any cyber threats that might hide!

Vulnerability scanners are software tools that identify security flaws and vulnerabilities in the code, systems, and applications. They have been used for many years to secure software and have proven to be effective. However, the increasing complexity of software systems, the speed of software development, and the need for real-time security data have exposed the limitations of traditional vulnerability scanners.

Executive Order 14028

Executive Order 14028, signed by President Biden on May 12, 2021, aims to improve the cybersecurity of federal networks and critical infrastructure by strengthening software supply chain security. The order requires federal agencies to adopt measures to ensure the security of software throughout its entire lifecycle, from development to deployment and maintenance.

NIST consulted with the National Security Agency (NSA), Office of Management and Budget (OMB), Cybersecurity & Infrastructure Security Agency (CISA), and the Director of National Intelligence (DNI) and then defined “critical software” by June 26, 2021.  

Such guidance shall include standards, procedures, or criteria regarding providing a purchaser a Software Bill of Materials (SBOM) for each product directly or by publishing it on a public website.

Object Model

CycloneDX Object Model Swimlane
SBOM Object Model

SBOM generation is a newer approach to software security that provides a comprehensive view of the components and dependencies that make up a software system. SBOMs allow devops companies to see the full picture of their software and understand all the components, including open-source libraries and dependencies, that are used in their software development process. This information is critical for devops companies to have, as it allows them to stay on top of security vulnerabilities and take the necessary measures to keep their software secure.

The main advantage of SBOM generation over vulnerability scanners is that SBOMs provide a real-time view of software components and dependencies, while vulnerability scanners only provide information about known vulnerabilities.

One practical example of an SBOM-capable tool is Trivy, an open-source scanner for container images and runtime environments. It detects vulnerabilities and integrates with the CI/CD pipeline, making it an effective tool for DevOps companies.

Another example is Anchore Grype, an open-source vulnerability scanner that, together with its sibling Syft for SBOM generation, provides visibility into software components and dependencies, making it easier for DevOps companies to stay on top of security vulnerabilities.

OWASP Dependency-Track integrations

Finally, Dependency Track is another great tool by OWASP that allows organizations to identify and reduce risk in the software supply chain.
The Open Web Application Security Project® (OWASP) is a nonprofit foundation that works to improve the security of software through community-led open-source software projects.

The main features of Dependency Track include:

  1. Continuous component tracking: Dependency Track tracks changes to software components and dependencies in real-time, ensuring up-to-date security information.
  2. Vulnerability Management: The tool integrates with leading vulnerability databases, including the National Vulnerability Database (NVD), to provide accurate and up-to-date information on known vulnerabilities.
  3. Policy enforcement: Dependency Track enables organizations to create custom policies to enforce specific security requirements and automate the enforcement of these policies.
  4. Component Intelligence: The tool provides detailed information on components and dependencies, including licenses, age, and other relevant information.
  5. Integration with DevOps tools: Dependency Track integrates with popular DevOps tools, such as Jenkins and GitHub, to provide a seamless experience for devops teams.
  6. Reporting and Dashboards: Dependency Track provides customizable reports and dashboards to help organizations visualize their software components and dependencies, and identify potential security risks.


CKS Challenge #1

Here we’re going to see together how to fix a buggy Kubernetes architecture, thanks to a nice KodeKloud challenge, where:

  1. The persistent volume claim can’t be bound to the persistent volume
  2. Load the ‘AppArmor’ profile called ‘custom-nginx’ and ensure it is enforced.
  3. The deployment alpha-xyz use an insecure image and needs to mount the ‘data volume’.
  4. ‘alpha-svc’ should be exposed on ‘port: 80’ and ‘targetPort: 80’ as ClusterIP
  5. Create a NetworkPolicy called ‘restrict-inbound’ in the ‘alpha’ namespace. Policy Type = ‘Ingress’. Inbound access only allowed from the pod called ‘middleware’ with label ‘app=middleware’. Inbound access only allowed to TCP port 80 on pods matching the policy
  6. ‘external’ pod should NOT be able to connect to ‘alpha-svc’ on port 80


1 Persistent Volume Claim

First of all we notice the PVC is there but pending, so let’s look into it.

The first difference we notice is the access mode, which is ReadWriteOnce on the PVC but ReadWriteMany on the PV.

We also want to check whether that storage class is present on the cluster.

Let’s fix that by creating a local-storage resource:
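A minimal local-storage class with no dynamic provisioner might look like this; it matches the storageClassName referenced by the PVC:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
```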

Get the PVC YAML, delete the extra lines and modify access mode:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
  - kubernetes.io/pvc-protection
  name: alpha-pvc
  namespace: alpha
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: local-storage
  volumeMode: Filesystem

Now the PVC is “waiting for first consumer”… so let’s move on to fixing the deployment 🙂

https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims

https://kubernetes.io/docs/concepts/storage/storage-classes/#local


2 App Armor

Before fixing the deployment we need to load the App Armor profile, otherwise the pod won’t start.

To do this we move our profile into /etc/apparmor.d and load it in enforce mode (for example with apparmor_parser -r).


3 DEPLOYMENT

For this exercise the permitted images are: ‘nginx:alpine’, ‘bitnami/nginx’, ‘nginx:1.13’, ‘nginx:1.17’, ‘nginx:1.16’ and ‘nginx:1.14’.
We use ‘trivy’ to find the image with the fewest ‘CRITICAL’ vulnerabilities.
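Choosing the image then reduces to comparing per-image CRITICAL counts; here is a sketch with canned numbers (in reality each count would come from a run like trivy image --severity CRITICAL <image>):

```shell
# Canned "image critical-count" pairs standing in for real trivy runs;
# sort numerically by the count and keep the safest image
printf 'nginx:alpine 0\nnginx:1.13 41\nbitnami/nginx 3\nnginx:1.17 25\n' \
  | sort -k2 -n \
  | head -1
# -> nginx:alpine 0
```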

Let’s take a look at what we have now

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: alpha-xyz
  name: alpha-xyz
  namespace: alpha
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alpha-xyz
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: alpha-xyz
    spec:
      containers:
      - image: ?
        name: nginx

We can now scan all our images and see that the most secure is the alpine version.

So we can now fix the deployment with three changes:

  • put nginx:alpine image
  • add alpha-pvc as a volume named ‘data-volume’
  • insert the annotation for the AppArmor profile created before
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: alpha-xyz
  name: alpha-xyz
  namespace: alpha
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alpha-xyz
  strategy: {}
  template:
    metadata:
      labels:
        app: alpha-xyz
      annotations:
        container.apparmor.security.beta.kubernetes.io/nginx: localhost/custom-nginx
    spec:
      containers:
      - image: nginx:alpine
        name: nginx
        volumeMounts:
        - name: data-volume
          mountPath: /usr/share/nginx/html
      volumes:
      - name: data-volume
        persistentVolumeClaim:
          claimName: alpha-pvc
---

4 SERVICE

We can be fast on this with one line

kubectl expose deployment alpha-xyz --type=ClusterIP --name=alpha-svc --namespace=alpha --port=80 --target-port=80

5 NETWORK POLICY

Here we want a policy that applies:

  • over pods matching ‘alpha-xyz’ label
  • only for incoming (ingress) traffic
  • allowing traffic only from pods labelled ‘app=middleware’
  • over port 80
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-inbound
  namespace: alpha
spec:
  podSelector:
    matchLabels:
      app: alpha-xyz
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: middleware
      ports:
        - protocol: TCP
          port: 80
        

We can now test that the route is closed between the ‘external’ pod and ‘alpha-xyz’.

Done!


Connect to an external service on a different AKS cluster through private network

My goal is to call a service on an AKS cluster (aks1/US) from a pod on a second AKS cluster (aks2/EU).
These clusters will be on different regions and should communicate over a private network.

For the cluster networking I’m using the Azure CNI plugin.

Above you can see a schema of the two possible final architectures: an ExternalName or ExternalIP service on the US AKS pointing to a private EU ingress controller IP.

So, after some reading and some video-watching, it seemed to me that the best option was an ExternalName service on AKS2 pointing to a service defined in a custom private DNS zone (ecommerce.private.eu.dev), with the two VNets peered beforehand.

Address space for aks services:
dev-vnet  10.0.0.0/14
=======================================
dev-test1-aks   v1.22.4 - 1 node
dev-test1-vnet  11.0.0.0/16
=======================================
dev-test2-aks   v1.22.4 - 1 node
dev-test2-vnet  11.1.0.0/16 

After some trials I could get connectivity between the pod networks, but I was never able to reach the service network from the other cluster.

  • I don’t have any active firewall
  • I’ve peered all three networks: dev-test1-vnet, dev-test2-vnet, dev-vnet (services CIDR)
  • I’ve created a Private DNS zone private.eu.dev where I’ve put the “ecommerce” A record (10.0.129.155) that should be resolved by the ExternalName service

dev-test1-aks (EU cluster):

kubectl create deployment eu-ecommerce --image=k8s.gcr.io/echoserver:1.4 --port=8080 --replicas=1

kubectl expose deployment eu-ecommerce --type=ClusterIP --port=8080 --name=eu-ecommerce

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.1.1/deploy/static/provider/cloud/deploy.yaml

kubectl create ingress eu-ecommerce --class=nginx --rule=eu.ecommerce/*=eu-ecommerce:8080

This is the ingress rule:

❯ kubectl --context=dev-test1-aks get ingress eu-ecommerce-2 -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: eu-ecommerce-2
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: lb.private.eu.dev
    http:
      paths:
      - backend:
          service:
            name: eu-ecommerce
            port:
              number: 8080
        path: /ecommerce
        pathType: Prefix
status:
  loadBalancer:
    ingress:
    - ip: 20.xxxxx

This is one of the ExternalName services I’ve tried on dev-test2-aks:

apiVersion: v1
kind: Service
metadata:
  name: eu-services
  namespace: default
spec:
  type: ExternalName
  externalName: ecommerce.private.eu.dev
  ports:
    - port: 8080
      protocol: TCP

These are some of my tests:

# --- Test externalName 
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://eu-services:8080
: '
    wget: cant connect to remote host (10.0.129.155): Connection timed out
'

# --- Test connectivity AKS1 -> eu-ecommerce service
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://eu-ecommerce:8080
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://10.0.129.155:8080
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://eu-ecommerce.default.svc.cluster.local:8080
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://ecommerce.private.eu.dev:8080
# OK client_address=11.0.0.11

# --- Test connectivity AKS2 -> eu-ecommerce POD
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://11.0.0.103:8080
#> OK


# --- Test connectivity - LB private IP
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget --no-cache -qO- http://lb.private.eu.dev/ecommerce
#> OK
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget --no-cache -qO- http://lb.private.eu.dev/ecommerce
#> KO  wget: can't connect to remote host (10.0.11.164): Connection timed out
#>> This is the ClusterIP! -> Think twice!


# --- Traceroute gives no information
kubectl --context=dev-test2-aks  run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- traceroute -n -m4 ecommerce.private.eu.dev
: '
    *  *  *
    3  *  *  *
    4  *  *  *
'

# --- test2-aks can see the private dns zone and resolve the hostname
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- nslookup ecommerce.private.eu.dev
: ' Server:    10.0.0.10
    Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
    Name:      ecommerce.private.eu.dev
    Address 1: 10.0.129.155
'

I’ve also created inbound and outbound network policies for the AKS networks:

  • on dev-aks (10.0/16) allow all incoming from 11.1/16 and 11.0/16
  • on dev-test2-aks allow any outbound

SOLUTION: make the ingress controller’s load balancer an internal one, exposing its IP on the private subnet

kubectl --context=dev-test1-aks patch service -n ingress-nginx ingress-nginx-controller --patch '{"metadata": {"annotations": {"service.beta.kubernetes.io/azure-load-balancer-internal": "true"}}}'

This article is also in Medium 🙂


