Thursday, February 2, 2023

Solution Architect Interview Questions

1. What do software architects do?

Software architects are experienced, senior developers who act as a bridge between software engineering teams and clients to turn requirements into precise software design solutions. Some of their primary responsibilities are:

  • Project code QA testing
  • Task distribution for software engineer teams
  • Technical standards evaluation
  • Breaking down project goals into deliverable tasks

2. What is meant by the KISS principle?

Answer:

KISS, a backronym for "keep it simple, stupid", is a design principle noted by the U.S. Navy in 1960. The KISS principle states that most systems work best if they are kept simple rather than made complicated; therefore, simplicity should be a key goal in design, and unnecessary complexity should be avoided.

Source: stackoverflow.com

3. Why is it a good idea for “lower” application layers not to be aware of “higher” ones?

Answer:

The fundamental motivation is this:

You want to be able to rip an entire layer out and substitute a completely different (rewritten) one, and NOBODY SHOULD (BE ABLE TO) NOTICE THE DIFFERENCE.

The most obvious example is ripping the bottom layer out and substituting a different one. This is what you do when you develop the upper layer(s) against a simulation of the hardware, and then substitute in the real hardware.

Also, layers, modules, and indeed architecture itself are means of making computer programs easier for humans to understand.

Source: stackoverflow.com
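
To make the idea concrete, here is a minimal, hypothetical Java sketch of the hardware-simulation example above: the upper layer depends only on an interface, so the simulated lower layer can be swapped for the real one without the upper layer noticing.

```java
// Hypothetical names; a sketch of a lower layer hidden behind an interface.
interface Sensor {                              // contract of the lower layer
    double readTemperature();
}

class SimulatedSensor implements Sensor {       // lower layer used during development
    public double readTemperature() { return 21.5; }
}

class HardwareSensor implements Sensor {        // real lower layer, swapped in later
    public double readTemperature() {
        return 0.0;                             // would talk to the device here
    }
}

class Thermostat {                              // upper layer: knows only Sensor
    private final Sensor sensor;
    Thermostat(Sensor sensor) { this.sensor = sensor; }
    boolean heatingNeeded() { return sensor.readTemperature() < 19.0; }
}
```

Because Thermostat never names a concrete sensor class, replacing SimulatedSensor with HardwareSensor requires no change to the upper layer.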

 

4. Which technical skills are required to be a successful software architect?

As well as knowledge of the Unified Modeling Language (UML), software architects need skills in various programming languages. They should also understand agile management and collaboration methods so they can align development and operations.

5. Which soft skills are required to be a successful software architect?

A crucial soft skill for software architects is effective leadership, but there are other essential skills, too. Soft skills commonly expected of a good software architect include:

  • Clear communication with both technical teams and stakeholders
  • Negotiation and conflict resolution
  • Problem-solving and decision-making
  • Time management and prioritization

 

6. What Is Load Balancing?

Answer:

Load balancing is a simple technique for distributing workloads across multiple machines or clusters. The most common and simplest load balancing algorithm is Round Robin (see the sketch below). In this type of load balancing, requests are distributed in circular order, ensuring that all machines receive an equal number of requests and that no single machine is overloaded or underloaded.

The purpose of load balancing is to:

  • Optimize resource usage (avoid overloading or under-loading any machine)
  • Achieve maximum throughput
  • Minimize response time

The most common load balancing techniques in web-based applications are:

  1. Round robin
  2. Session affinity or sticky session
  3. IP Address affinity

Source: fromdev.com
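
As a minimal illustration of the Round Robin technique described above (the names are hypothetical, not a production load balancer):

```java
// A round-robin selector: requests are handed to servers in circular order.
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger(0);

    RoundRobinBalancer(List<String> servers) { this.servers = servers; }

    String pick() {
        // floorMod keeps the index non-negative even after integer overflow.
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(List.of("a", "b", "c"));
        for (int n = 0; n < 6; n++) System.out.println(lb.pick()); // a b c a b c
    }
}
```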

7. What Is CAP Theorem?

Answer:

The CAP Theorem for distributed computing was published by Eric Brewer. This states that it is not possible for a distributed computer system to simultaneously provide all three of the following guarantees:

  1. Consistency (all nodes see the same data at the same time, even with concurrent updates)
  2. Availability (a guarantee that every request receives a response about whether it was successful or failed)
  3. Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)

The CAP acronym corresponds to these three guarantees. This theorem created the basis for modern distributed computing approaches. The world's highest-traffic companies (e.g., Amazon, Google, Facebook) use it as a basis for deciding their application architecture. It's important to understand that only two of these three conditions can be guaranteed to be met by a system.

Source: fromdev.com

8. Define Microservice Architecture

Answer:

Microservices, aka Microservice Architecture, is an architectural style that structures an application as a collection of small autonomous services, modeled around a business domain.

Source: lambdatest.com

9. Why use WebSocket over Http?

Answer:

WebSocket is a continuous connection between client and server. That continuous connection allows the following:

1. Data can be sent from server to client at any time, without the client even requesting it. This is often called server push and is very valuable for applications where the client needs to know quickly when something happens on the server (such as a new chat message being received or a price being updated). Data cannot be pushed to a client over HTTP; the client would have to poll regularly, making an HTTP request every few seconds, in order to get timely new data. Client polling is not efficient.

2. Data can be sent either way very efficiently. Because the connection is already established and a WebSocket data frame is very efficiently organized, one can send data much more efficiently than via an HTTP request, which necessarily carries headers, cookies, etc.

Source: stackoverflow.com
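
A minimal server-push sketch, assuming a Jakarta WebSocket container such as Tomcat 10+ (the endpoint path and message are illustrative):

```java
import java.io.IOException;
import jakarta.websocket.OnOpen;
import jakarta.websocket.Session;
import jakarta.websocket.server.ServerEndpoint;

@ServerEndpoint("/updates")
public class UpdatesEndpoint {

    @OnOpen
    public void onOpen(Session session) throws IOException {
        // Once the connection is open, the server can write at any time,
        // without the client making another request.
        session.getBasicRemote().sendText("new price: 42.17");
    }
}
```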

10. What do you mean by lower latency interaction?

Answer:

Low latency means that there is very little delay between the time you request something and the time you get a response. As it applies to WebSockets, it means that data can be sent more quickly (particularly over slow links), because the connection has already been established, so no extra packet round trips are required to set up the TCP connection.

Source: stackoverflow.com

11. What Is Scalability?

Answer:

Scalability is the ability of a system, network, or process to handle a growing amount of load by adding more resources. Resources can be added in two ways:

  • Scaling up
    This involves adding more resources to the existing nodes, for example, more RAM, storage, or processing power.
  • Scaling out
    This involves adding more nodes to support more users.

Either approach can be used to scale an application up or out; however, the per-user cost of adding resources may change as the volume increases. Adding resources to the system should increase its ability to take on load in proportion to the resources added.

An ideal application should be able to serve a high level of load with few resources. In practice, however, a linearly scalable system may be the best achievable option. Poorly designed applications can have a very high scaling cost, since they require more resources per user as the load increases.

Source: fromdev.com

12. Why Do You Need Clustering?

Answer:

Clustering is needed to achieve high availability for server software. The main purpose of clustering is to achieve 100% availability, or zero downtime, in a service. A typical server can run on one machine and serve requests as long as there is no hardware failure or some other failure. By creating a cluster of more than one machine, we reduce the chances of our service becoming unavailable if one of the machines fails.

Clustering does not always guarantee that a service will be 100% available, since there is still a chance that all the machines in a cluster fail at the same time. However, this is very unlikely if you have many machines that are located in different places or supported by their own resources.

Source: fromdev.com

13. What Is A Cluster?

Answer:

A cluster is a group of computers that can each run the software individually. Clusters are typically used to achieve high availability for server software, and clustering is used in many types of servers.

  • App server cluster
    A group of machines that run an application server and can be reliably utilized with a minimum of downtime.
  • Database server cluster
    A group of machines that run a database server and can be reliably utilized with a minimum of downtime.

Source: fromdev.com

14. What is Domain Driven Design?

Answer:

Domain Driven Design is a methodology and process prescription for the development of complex systems whose focus is mapping activities, tasks, events, and data within a problem domain into the technology artifacts of a solution domain.

It is all about trying to make your software a model of a real-world system or process.

Source: stackoverflow.com

15. What defines a software architect?

Answer:

An architect is the captain of the ship, making the decisions that cross multiple areas of concern (navigation, engineering, and so on), taking final responsibility for the overall health of the ship and its crew (the project and its members), and able to step into any station to perform those duties as the need arises (write code for any part of the project should they lose a member). Architects have to be familiar with the problem domain and the technology involved, and keep an eye out for new technologies that might make the project easier or answer new customers' feature requests.

Source: stackoverflow.com

16. What does the expression “Fail Early” mean, and when would you want to do so?

Answer:

Essentially, fail fast (a.k.a. fail early) is to code your software such that, when there is a problem, the software fails as soon as and as visibly as possible, rather than trying to proceed in a possibly unstable state.

The fail-fast approach won't reduce the overall number of bugs, at least not at first, but it will make most defects much easier to find.

Source: stackoverflow.com
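
A minimal fail-fast sketch in Java (a hypothetical class): the input is validated up front, so a bad value fails immediately and visibly instead of surfacing later in an unrelated place.

```java
class Transfer {
    private final long amountCents;

    Transfer(long amountCents) {
        if (amountCents <= 0) {
            // Fail immediately and visibly, at the point of the mistake.
            throw new IllegalArgumentException(
                "amount must be positive: " + amountCents);
        }
        this.amountCents = amountCents;
    }
}
```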

17. What does “program to interfaces, not implementations” mean?

Answer:

Coding against an interface means the client code always holds a reference of the interface type, supplied by a factory.

Any instance returned by the factory is of the interface type, which every candidate class in the factory must implement. This way, the client program is not concerned with the implementation, and the interface signature determines which operations can be performed.

This approach can be used to change the behavior of a program at run time. It also helps you write far better programs from a maintenance point of view.

Source: tutorialspoint.com
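
A minimal Java sketch of coding against an interface (all names are hypothetical): the client holds only the interface, and the factory decides the concrete class.

```java
interface Payment {
    void pay(long cents);
}

class CardPayment implements Payment {
    public void pay(long cents) { System.out.println("card: " + cents); }
}

class WalletPayment implements Payment {
    public void pay(long cents) { System.out.println("wallet: " + cents); }
}

class PaymentFactory {
    static Payment forMethod(String method) {
        // The factory is the only place that names concrete classes.
        return "card".equals(method) ? new CardPayment() : new WalletPayment();
    }
}

class Checkout {
    public static void main(String[] args) {
        Payment p = PaymentFactory.forMethod(args.length > 0 ? args[0] : "card");
        p.pay(1999); // the client never references a concrete implementation
    }
}
```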

18. What is Elasticity (in contrast to Scalability)?

Answer:

Elasticity means that the throughput of a system scales up or down automatically to meet varying demand, as resources are proportionally added or removed. The system needs to be scalable to allow it to benefit from the dynamic addition, or removal, of resources at runtime. Elasticity therefore builds upon scalability and expands on it by adding the notion of automatic resource management.

19. Explain what test-driven development means?

Test-driven development is a process in which you write the test before you write the code; when all tests pass, you improve the code. You're (re-)designing your code to make and keep it easily testable, and that makes it clean, uncomplicated, and easy to understand and change.
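
A minimal test-first sketch using JUnit 5 (the Discount class is hypothetical): the test is written before the code it exercises.

```java
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

class DiscountTest {
    @Test
    void tenPercentOffOrdersOfAtLeastOneHundred() {
        // Red: written first; it fails until Discount.apply exists and is correct.
        assertEquals(90.0, Discount.apply(100.0), 0.001);
    }
}

class Discount {
    // Green: the simplest implementation that makes the test pass.
    static double apply(double total) {
        return total >= 100.0 ? total * 0.9 : total;
    }
}
```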

Benefits of Test Driven Development

1. Early bug notification.
2. Easy bug diagnosis, as the tests pinpoint what went wrong.
3. It lowers the cost of enhancements. Keeping the code clean is also how you minimize the risk of accidental complications, and that means you can maintain a constant pace in delivering value.
4. With the safety net, developers are more willing to merge their changes and pull in other developers' changes, and they'll do it more often. Then trunk-based development and continuous integration, delivery, and deployment can really take off.
5. It lowers the number of bugs that 'escape' into production, which lowers support costs.

 

20. Difference between Cloud Elasticity and Cloud Scalability

| Cloud Elasticity | Cloud Scalability |
| --- | --- |
| Used to meet sudden ups and downs in the workload for a short period of time. | Used to meet a steady increase in the workload. |
| Meets dynamic changes, where the resources needed can increase or decrease. | Addresses the persistent growth of workload in an organization. |
| Commonly used by small companies whose workload and demand increase only for a specific period of time. | Used by large companies whose customer base grows persistently, so that operations run efficiently. |
| Short-term planning, adopted to deal with unexpected or seasonal spikes in demand. | Long-term planning, adopted to deal with an expected increase in demand. |

21. What Is Backpressure?

Resistance or force opposing the desired flow of data through software.

Let's use an example to clearly describe what it is:

1. The system contains three services: the Publisher, the Consumer, and the Graphical User Interface (GUI).
2. The Publisher sends 10,000 events per second to the Consumer.
3. The Consumer processes them and sends the result to the GUI.
4. The GUI displays the results to the users.
5. The Consumer can only handle 7,500 events per second.


At this rate, the Consumer cannot keep up with the events (backpressure). Consequently, the system would collapse and the users would not see the results.

How you handle backpressure can pretty much be summed up with three possible options (a small sketch follows the list):

 

• Control the producer (the consumer decides when to slow down or speed up the producer)
• Buffer (accumulate incoming data spikes temporarily)
• Drop (sample a percentage of the incoming data)
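
A minimal Java sketch of the buffer-plus-drop combination (capacities and rates are illustrative): a bounded queue absorbs spikes, and events are dropped once it is full rather than overwhelming the consumer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BackpressureDemo {
    public static void main(String[] args) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(1_000);

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    buffer.take();       // blocks when the buffer is empty
                    Thread.sleep(1);     // simulated slow processing
                }
            } catch (InterruptedException e) { /* shutdown */ }
        });
        consumer.setDaemon(true);
        consumer.start();

        int dropped = 0;
        for (int i = 0; i < 10_000; i++) {
            if (!buffer.offer(i)) dropped++;  // drop when the buffer is full
        }
        System.out.println("dropped " + dropped + " events");
    }
}
```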

22. What is the difference between the WebSocket and REST API?

| Criteria | WebSocket | REST API |
| --- | --- | --- |
| Performance | Low overhead per message; ideal for use cases that require low-latency, high-frequency communication. | Higher message overhead compared to WebSockets; best suited for use cases where you want to create, retrieve, update, or delete resources. |
| Nature | Socket-based. | Resource-based. |
| HTTP use | Uses HTTP only during the initial request/response handshake (connection establishment). | Uses HTTP for all client-server communication. |
| Communication | Event-driven and bidirectional. | Request-driven and unidirectional. |
| State | WebSocket is a stateful protocol. | REST uses the HTTP protocol, which is stateless. |
| TCP connection | A WebSocket connection uses a single TCP connection for the whole data exchange. | Every request/response requires a new TCP connection. |

23. Explain how concurrency is different from parallelism.

Concurrency is when two or more tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant. For example, multitasking on a single-core machine.

Parallelism is when tasks literally run at the same time, e.g., on a multicore processor.

 

| Concurrency | Parallelism |
| --- | --- |
| The task of running and managing multiple computations at the same time. | The task of running multiple computations simultaneously. |
| Achieved by interleaving processes on the central processing unit (CPU), in other words by context switching. | Achieved by using multiple central processing units (CPUs). |
| Can be done with a single processing unit. | Cannot be done with a single processing unit; it needs multiple processing units. |
| Increases the amount of work finished at a time. | Improves the throughput and computational speed of the system. |
| Is about dealing with lots of things at once. | Is about doing lots of things at once. |
| A non-deterministic control-flow approach. | A deterministic control-flow approach. |
| Debugging is very hard. | Debugging is also hard, but simpler than with concurrency. |

24. What is session affinity?

Session affinity is a feature available on load balancers that allows all subsequent traffic and requests from an initial client session to be passed to the same server in the pool. Session affinity is also referred to as session persistence, server affinity, server persistence, or sticky sessions.

25. What is session replication?

Session replication is a mechanism used to replicate the data stored in a session across different instances. However, the replicated instance must be part of the same cluster. When session replication is enabled in a cluster environment, the entire session data is copied on a replicated instance.

26. Session affinity (sticky sessions) vs. session replication?

If you're using session replication without sticky sessions: imagine you have only one user using your web app and three Tomcat instances. This user sends several requests to your app; the load balancer will send some of these requests to the first Tomcat instance, some to the second instance, and others to the third. Because the session is replicated, whichever instance receives a request has a copy of the user's session data.

If you're using sticky sessions without replication: imagine the same single user and three Tomcat instances. The load balancer will send the user's first request to one of the three Tomcat instances, and all the other requests sent by this user during the session will go to that same instance. If you shut down or restart that Tomcat instance during these requests, the load balancer sends the remaining requests to another instance that is still running. But since you don't use session replication, the instance that receives the remaining requests has no copy of the user's session, so it starts a new one: the user loses their session and is disconnected from the web app, even though the web app is still running.

If you're using sticky sessions WITH session replication: imagine the same setup again. The load balancer pins the user's requests to one Tomcat instance, as before. If you shut down or restart that instance, the load balancer sends the remaining requests to another instance that is still running. Because you use session replication, that instance has a copy of the user's session, so the user keeps their session: they continue to browse your web app without being disconnected, and the shutdown of the Tomcat instance doesn't impact their navigation.

27. What Is Middle Tier Clustering?

Middle-tier clustering is simply a cluster used to serve the middle tier of an application. This is popular because many clients may use the middle tier, and the heavy load it serves requires it to be highly available.

Failure of the middle tier can cause multiple clients and systems to fail; therefore, clustering the middle tier of an application is a common approach.

In the Java world, it is really common to have EJB server clusters that are used by many clients. In general, any application with business logic that can be shared across multiple clients can use a middle-tier cluster for high availability.

28. Explain high availability in the software architect field?

There are organizations that require their systems to be operational 24/7. For these organizations, HA architecture is essential. While HA does not guarantee that systems will not be hit by unplanned interruptions, it minimizes the impact of such interruptions on your operations. A more responsive system is another benefit of HA.

 

HA architecture ensures that your systems are up and running and accessible to your users in the face of unforeseen circumstances such as hardware and software failures. With it, you use multiple components to ensure continuous and responsive service.

Redundant hardware: Lack of redundant hardware means no requests can be served until a server is restarted after a crash. When this happens, downtime is inevitable. Thus, your HA architecture must include backup hardware such as servers or server clusters that take over automatically in case of production hardware crashes.

Redundant software and applications: To prevent potential downtime whenever there are failures in the software and applications used in your production environment, it is crucial that your HA architecture includes backup software and applications.

Redundant data: Database servers that go offline for one reason or another can wreak havoc on your production environment. Your HA architecture should include provisions for backup database servers to which processing can be shifted whenever a production database server goes offline.

No single point of failure: A failure in a single component should not crash your entire infrastructure. With redundancy in hardware, software, and data, single points of failure are eliminated.

29. What does fault tolerance mean?

Fault tolerance is a process that enables an operating system to respond to a failure in hardware or software. This fault-tolerance definition refers to the system’s ability to continue operating despite failures or malfunctions.

A fault-tolerant system cannot be disrupted by a single point of failure. It ensures business continuity and the high availability of crucial applications and systems regardless of any failures.

30. How Does Fault Tolerance Work?

Fault tolerance can be built into a system to remove the risk of it having a single point of failure. To do so, the system must have no single component that, if it were to stop working effectively, would result in the entire system failing.

Fault tolerance is reliant on aspects like load balancing and failover, which remove the risk of a single point of failure. It will typically be part of the operating system’s interface, which enables programmers to check the performance of data throughout a transaction.

31. What does fault resilience mean?

Fault resilience: failures may be observed in some services, but the rest of the system continues to function normally.

Resilience means how many faults the system can tolerate.

32. What is the DRY principle?

The DRY (don't repeat yourself) principle is a best practice in software development that recommends that software engineers do something once, and only once. The goal of the DRY principle is to lower technical debt by eliminating redundancies in process and logic whenever possible.

 

Redundancies in process

To prevent redundancies in processes (actions required to achieve a result), followers of the DRY principle seek to ensure that there is only one way to complete a particular process. Automating the steps wherever possible also reduces redundancy, as well as the number of actions required to complete a task.

Redundancies in logic

To prevent redundancies in logic (code), followers of the DRY principle use abstraction to minimize repetition. Abstraction is the process of removing characteristics until only the most essential characteristics remain.

An important goal of the DRY principle is to improve the maintainability of code during all phases of its lifecycle. When the DRY principle is followed, for example, a software developer should be able to change code in one place, and have the change automatically applied to every instance of the code in question. 
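
A minimal before/after Java sketch (the rule is hypothetical): the validation logic lives in one place, so a later change touches one method instead of every call site.

```java
class Validation {
    // Before DRY: the same check copy-pasted wherever an email is handled.
    //   if (email != null && email.contains("@")) { ... }  // in class A
    //   if (email != null && email.contains("@")) { ... }  // in class B

    // After DRY: one authoritative definition used everywhere.
    static boolean isValidEmail(String email) {
        return email != null && email.contains("@");
    }
}
```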

33. What is the DIE principle?

DIE in software development is an acronym that means “duplication is evil.” The DIE principle is used in the same situations as the DRY principle and aims to ensure that software architects and developers avoid duplicating concepts. It also contributes to efficient code maintainability.

34. Explain the ACID acronym?

Atomicity

All changes to data are performed as if they are a single operation. That is, all the changes are performed, or none of them are.

For example, in an application that transfers funds from one account to another, the atomicity property ensures that, if a debit is made successfully from one account, the corresponding credit is made to the other account.

Consistency

Data is in a consistent state when a transaction starts and when it ends.

For example, in an application that transfers funds from one account to another, the consistency property ensures that the total value of funds in both the accounts is the same at the start and end of each transaction.

Isolation

The intermediate state of a transaction is invisible to other transactions. As a result, transactions that run concurrently appear to be serialized.

For example, in an application that transfers funds from one account to another, the isolation property ensures that another transaction sees the transferred funds in one account or the other, but not in both, nor in neither.

Durability

After a transaction successfully completes, changes to data persist and are not undone, even in the event of a system failure.

For example, in an application that transfers funds from one account to another, the durability property ensures that the changes made to each account will not be reversed.
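
A minimal sketch of the funds-transfer example using plain JDBC (the table and column names are hypothetical): both updates commit together, or the rollback undoes both, which is atomicity in practice.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class FundsTransfer {
    static void transfer(Connection con, int from, int to, long cents)
            throws SQLException {
        con.setAutoCommit(false);                 // start a transaction
        try (PreparedStatement debit = con.prepareStatement(
                 "UPDATE accounts SET balance = balance - ? WHERE id = ?");
             PreparedStatement credit = con.prepareStatement(
                 "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
            debit.setLong(1, cents);  debit.setInt(2, from);  debit.executeUpdate();
            credit.setLong(1, cents); credit.setInt(2, to);   credit.executeUpdate();
            con.commit();                         // both changes become durable
        } catch (SQLException e) {
            con.rollback();                       // or neither does
            throw e;
        }
    }
}
```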

35. You Aren't Gonna Need It (YAGNI)?

You Aren't Gonna Need It (YAGNI) is an Extreme Programming (XP) practice which states: "Always implement things when you actually need them, never when you just foresee that you need them."

Even if you're totally, totally, totally sure that you'll need a feature later on, don't implement it now. Usually, it'll turn out that either:

• you don't need it after all, or
• what you actually need is quite different from what you foresaw needing earlier.

This doesn't mean you should avoid building flexibility into your code. It means you shouldn't overengineer something based on what you think you might need later on.

 

There are two main reasons to practice YAGNI:

• You save time because you avoid writing code that you turn out not to need.
• Your code is better because you avoid polluting it with 'guesses' that turn out to be more or less wrong but stick around anyway.

36. Explain what SOLID means.

The SOLID acronym covers five principles for software architecture and development. These principles are:

 

Single responsibility: "There should never be more than one reason for a class to change." In other words, every class should have only one responsibility.

Open/closed: The open/closed principle indicates that although a module or class should be open for extension, it should be closed for modification.

Liskov substitution: "Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it".

Interface segregation: "Many client-specific interfaces are better than one general-purpose interface."

Dependency inversion: The dependency inversion principle suggests that a high-level class shouldn't rely on a low-level class; both should depend on abstractions, not concretions (see the sketch below).
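
A minimal dependency-inversion sketch in Java (hypothetical names): the high-level service depends on an abstraction, and the low-level detail implements it.

```java
interface Storage {                        // the abstraction both sides depend on
    void save(String data);
}

class DiskStorage implements Storage {     // low-level detail
    public void save(String data) { /* write to a file */ }
}

class ReportService {                      // high-level policy
    private final Storage storage;
    ReportService(Storage storage) { this.storage = storage; }  // injected
    void publish(String report) { storage.save(report); }
}
```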

37. Difference between Shared Nothing Architecture and Shared Disk Architecture?

| Shared Nothing Architecture | Shared Disk Architecture |
| --- | --- |
| The nodes do not share memory or storage. | The nodes share the storage. |
| Each node has its own disks, which cannot be shared. | The disks are shared by the active nodes, and another node can take over in case of failure. |
| Cheaper hardware compared to shared disk architecture. | Comparatively expensive hardware. |
| The data is strictly partitioned. | The data is not partitioned. |
| Fixed load balancing. | Dynamic load balancing. |
| Scaling capacity is easier: for more space, a new node can be added to the cluster. | Since the cluster uses a single disk subsystem with distinct node memories, capacity is increased by upgrading the nodes' memory. |
| Its main advantage is high availability. | Its main advantage is (near) unlimited scalability. |

Shared nothing architecture, pros:

• Easy to scale.
• Reduces single points of failure, makes upgrades easier, and avoids downtime.

Shared nothing architecture, cons:

• Performance can deteriorate, since requests that span partitions must be coordinated across nodes.
• Expensive.

Shared disk architecture, pros:

• It can scale up to a fair number of CPUs.
• Each processor has its own memory, so the memory bus is not a bottleneck.
• Fault tolerance: the database is stored on disks accessible from all processors, so other processors can take over the task if one fails.

Shared disk architecture, cons:

• Limited scalability, because the disk subsystem's interconnect becomes the bottleneck.
• Slower CPU-to-CPU communication, since it passes through a communication network.
• Adding more CPUs slows the system down, due to increased competition for memory access and network bandwidth.

38. Explain what sharding is?

Sharding is a method software architects use to split one logical dataset and store it across several databases. Distributing the data over several machines makes it possible to store a bigger dataset than a single machine can hold (see the sketch below).
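
A minimal hash-based shard-routing sketch (the shard count and names are illustrative): a stable hash of the key decides which database holds the row.

```java
class ShardRouter {
    private static final String[] SHARDS = { "db0", "db1", "db2", "db3" };

    static String shardFor(String userId) {
        // floorMod avoids a negative index for negative hash codes.
        return SHARDS[Math.floorMod(userId.hashCode(), SHARDS.length)];
    }

    public static void main(String[] args) {
        System.out.println(shardFor("user-42")); // always the same shard for a key
    }
}
```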

39. Explain why layering an application is vital?

There are several reasons why it's important to divide an application into layers, such as:

Improves component classification: Thanks to the separation of concerns, each layer performs its own function. This is beneficial because it becomes easier to assign every component with its own individual classification and helps you develop effective responsibility models and roles in your software architecture.

Low overhead costs: Unlike other software architectures, the division of layers in an application simplifies the development process and can significantly reduce overhead costs. Therefore, you can allocate your savings towards fulfilling other important development operations.

Easier to write and develop applications: When you divide an application into layers, it becomes easier to work on them as individual units, rather than as one large complex system. This type of process means that you can develop the entire application more effectively.

Easier to test applications: With separate layers in the software architecture, you can test each layer one at a time. This is beneficial because you can gather critical information about every layer without mixing up any of the data.

Benefits from layers of isolation: The layers of isolation concept refers to your ability to change one layer of the architecture, without those changes directly affecting components within any of the other layers. This is important because it means you can make modifications to the layers without fear of them negatively impacting one another.

Improves problem-solving initiatives: With each individual layer having a singular focus, it becomes easier to solve problems within those layers. When you solve an issue with one layer, you can move on to addressing other problems in other layers without your solutions causing any adverse effects.

Easier to identify errors: Because there are multiple layers for you to analyze, it's easier for you to identify any errors within them. You can then implement fixes to address those errors in a timely manner before they have the opportunity to escalate.

Highly effective for monolithic applications: Because all the layers and functionalities exist within one deployable unit, the layered pattern is particularly useful for monolithic applications. These types of applications are simple to develop, and you can often complete them quickly.

Well-known in the industry: Most software developers are familiar with the layered architecture pattern, which can be beneficial for you and your colleagues working on a project. This means that it's easy to collaborate on with other skilled professionals.

40. Explain what cache stampede means?

A cache stampede occurs when several threads attempt to access a cache in parallel. If the cached value doesn’t exist, the threads will then attempt to fetch the data from the origin at the same time. The origin is commonly a database but it can also be a web server, third-party API, or anything else that returns data.

One of the main reasons why a cache stampede can be so devastating is because it can lead to a vicious failure loop:

• A substantial number of concurrent threads get a cache miss, leading to them all calling the database.
• The database crashes due to an enormous CPU spike, which leads to timeout errors.
• Receiving the timeout, all the threads retry their requests, causing another stampede.
• On and on the cycle continues.

41. Avoiding Cache Stampede?

Cache stampede is a fancy name for a cascading failure that occurs when caching mechanisms and the underlying application layers come under high load at the same time, in a very short window.

Cache stampede is not something that will be considered a problem for most systems out there; it is an issue encountered only at a certain scale and with specific data access patterns.

 

To give you an example imagine you are doing an expensive SQL query that takes 3 seconds to complete and spikes CPU usage of your database server. You want to cache that query as running multiple of those in parallel can cause database performance problems from which your database will not recover bringing your entire app down.

Let’s say you have a traffic of 100 requests a second. With our 3-second query example, if you get 100 requests in one second and your cache is cold, you’ll end up with 300 processes all running the same uncached query.

That becomes a real problem for high-load systems that receive hundreds or even thousands of requests a second while the cache is initially populated (or re-populated), causing massive amounts of simultaneous writes to the cache and massive traffic to the underlying systems. This has the potential to cause major problems for your infrastructure.

Avoiding Cache Stampede

In general, there are three approaches to avoid cache stampede:

• external recomputation,
• probabilistic early expiration,
• locking.

External recomputation

This involves keeping an external process or a system alive that will re-compute your values and re-populate cache either on predefined intervals or when a cache miss happens.

The first issue is working with a cold cache. As I mentioned above, a flood of cache misses in a short period of time has the potential to run many expensive operations in parallel.

Continually re-populating cache also comes with the drawback that if there is no uniform access pattern to your data you will end up with cache bloated with useless information.

Probabilistic early expiration

When an expiration of cache nears, one of the processes (or actors) in your system will volunteer to re-calculate its value and to re-populate cache with a new value.

Rails has this implemented as :race_condition_ttl setting on the cache itself.

For this approach to be effective, you have to have at least one request per TTL; otherwise your cache will go cold, pushing you back to square one and facing a stampede.

If you do not set some sort of lock around recalculation while using probabilistic early expiration, you will end up with multiple processes hammering your cache and the underlying systems, computing and re-writing the same value.

Locking

This one is the simplest but also the most versatile of the three. As soon as a process (or an actor) encounters a cache miss, it sets a global lock on re-computing the requested value. This prevents other processes from attempting the same recomputation. Once the first process computes the value, it writes it to the cache and releases the lock.

 

It will be up to you to decide what to do when another process encounters a cache miss and there is a global lock on a requested key. You can make it wait, make it return nil, raise an error, etc.

Locking has other benefits as well. You can keep your cache TTLs low and not have to worry about storing or serving stale data, in an effort to avoid a cache stampede.

You can also combine locking with external recomputation or probabilistic early expiration to avoid issues with a cold cache.
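
A minimal sketch of the locking approach within a single JVM, using Java's ConcurrentHashMap (the loader function is hypothetical): computeIfAbsent blocks concurrent misses on the same key, so the expensive computation runs only once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class StampedeSafeCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();

    V get(K key, Function<K, V> expensiveLoader) {
        // Concurrent misses on the same key wait for the first computation
        // to finish, then reuse its result instead of recomputing.
        return cache.computeIfAbsent(key, expensiveLoader);
    }
}
```

Note that this protects a single process only; with many application servers, the same idea is applied with a lock stored in the shared cache itself (e.g., a lock key in Redis or Memcached).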

42. Explain what shared nothing architecture is?

Shared nothing architecture is an architecture used in distributed computing in which each node is independent and the nodes are interconnected by a network. Every node is made of a processor, main memory, and disk. The main motive of this architecture is to remove contention among nodes. The nodes do not share memory or storage; each node has its own disks, which cannot be shared. It works effectively in a high-volume, read-write environment.

43. Explain what shared Disk architecture is?

Shared disk architecture is an architecture used in distributed computing in which the nodes share the same disk devices but each node has its own private memory. The disks are accessible from all cluster nodes, so an active node can take over in case of failure. This architecture adapts quickly to changing workloads and uses robust optimization techniques.

44. What is the “robust” software building approach?

The Seven Components of a Robust Software Development Process

They are as follows:

1. A steadfast development process that provides interaction with users by identifying their spoken and unspoken requirements throughout the software life cycle.
2. Provision for feedback and iteration between two or more development stages as needed.
3. An instrument to optimize the design for reliability (or other attributes), cost, and cycle time at upstream stages. This activity, which addresses software product robustness, is one of the unique features of the RSDM; other software development models do not address it.
4. Opportunity for the early return on investment that incremental development methods provide.
5. Step-wise development to build an application as needed and to provide adequate documentation.
6. Provision for risk analyses at various stages.
7. Capability to provide for object-oriented development.

45. What is the difference between the “fail fast” and “robust” software building approaches?

Some people recommend making your software robust by working around problems automatically. This results in the software “failing slowly.” The program continues working right after an error but fails in strange ways later on.

A system that fails fast does exactly the opposite: when a problem occurs, it fails immediately and visibly. Failing fast is a nonintuitive technique: “failing immediately and visibly” sounds like it would make your software more fragile, but it actually makes it more robust. Bugs are easier to find and fix, so fewer go into production.

Overall, the quicker and more visible the failure is, the faster it will be fixed, and the simpler the fix will be. Fail fast is a much better approach for maintainability.

46. Explain heuristic exceptions?

A Heuristic Exception refers to a transaction participant’s decision to unilaterally take some action without the consensus of the transaction manager, usually as a result of some kind of catastrophic failure between the participant and the transaction manager.

In a distributed environment communications failures can happen. If communication between the transaction manager and a recoverable resource is not possible for an extended period of time, the recoverable resource may decide to unilaterally commit or rollback changes done in the context of a transaction. Such a decision is called a heuristic decision. It is one of the worst errors that may happen in a transaction system, as it can lead to parts of the transaction being committed while other parts are rolled back, thus violating the atomicity property of transaction and possibly leading to data integrity corruption.

Because of the dangers of heuristic exceptions, a recoverable resource that makes a heuristic decision is required to maintain all information about the decision in stable storage until the transaction manager tells it to forget about the heuristic decision. The actual data about the heuristic decision that is saved in stable storage depends on the type of recoverable resource and is not standardized. The idea is that a system manager can look at the data, and possibly edit the resource to correct any data integrity problems.

47. Explain what cohesion means in software architecture?

Cohesion is the degree to which the various responsibilities of a module are related and focused. It is a measure of the strength of the relationship between a class's methods and data. We should strive to maximize cohesion: high cohesion results in components that are easier to understand, maintain, and reuse.

Cohesion is increased if:

• The functionalities embedded in a class, accessed through its methods, have much in common.
• Methods carry out a small number of related activities, avoiding coarsely grained or unrelated sets of data.
• Related methods are in the same source file or otherwise grouped together; for example, in separate files but in the same sub-directory/folder.

48. Explain what coupling means in software architecture?

Coupling is the degree to which each module depends on other modules; a measure of how closely connected two modules are. We should strive to minimize coupling.

Coupling is usually contrasted with cohesion. Low coupling often correlates with high cohesion and vice versa.

Tightly coupled modules have the following disadvantages:

 

• A change in one module might break another module.
• A change in one module usually forces a ripple effect of changes in other modules.
• Reusability decreases as dependency on other modules increases.
• Assembly of modules might require more effort and/or time.

Coupling can be reduced by:

• Hiding inner details and interacting through interfaces.
• Avoiding interaction with classes that a module can avoid dealing with directly.
• Designing components so they can be replaced with alternative implementations that provide the same services, as in a loosely coupled system.

49. What is eventual consistency?

Unlike the strict consistency property of relational databases, the eventual consistency property of a system ensures that any transaction will eventually (not immediately) bring the database from one valid state to another. This means there can be intermediate states that are not consistent across multiple nodes.

Eventually consistent systems are useful in scenarios where absolute consistency is not critical. For example, in the case of a Twitter status update, it may not be very damaging if some users do not see the latest status from a particular user for a while.

Eventually consistent systems cannot be used where absolute/strict consistency is required. For example, a banking transaction system cannot use eventual consistency, since it must consistently reflect the state of a transaction at any point in time. Your account balance should not show different amounts when accessed from different ATMs.

50. Explain what the GOD class is, Why should you avoid the GOD class?

The most effective way to break applications is to create GOD classes: classes that keep track of a lot of information and have several responsibilities. One code change will most likely affect other parts of the class, and therefore indirectly all other classes that use it. That in turn leads to an even bigger maintenance mess, since no one dares to make any changes other than adding new functionality to it.

51. What is Unit test, Integration Test, Smoke test, Regression Test and what are the differences between them?

Unit test: Specify and test one point of the contract of single method of a class. This should have a very narrow and well defined scope. Complex dependencies and interactions to the outside world are stubbed or mocked.

Integration test: Test the correct inter-operation of multiple subsystems. There is whole spectrum there, from testing integration between two classes, to testing integration with the production environment.

Smoke test (aka Sanity check): A simple integration test where we just check that when the system under test is invoked it returns normally and does not blow up.

Smoke testing is an analogy both with electronics, where the first test occurs when powering up a circuit (if it smokes, it's bad!), and, apparently, with plumbing, where a system of pipes is literally filled with smoke and then checked visually. If anything smokes, the system is leaky.

Regression test: A test that was written when a bug was fixed. It ensures that this specific bug will not occur again. The full name is “non-regression test”. It can also be a test made prior to changing an application to make sure the application provides the same outcome.

To this, I will add:

Acceptance test: Test that a feature or use case is correctly implemented. It is similar to an integration test, but with a focus on the use case to provide rather than on the components involved.

System test: Tests a system as a black box. Dependencies on other systems are often mocked or stubbed during the test (otherwise it would be more of an integration test).

Pre-flight check: Tests that are repeated in a production-like environment, to alleviate the ‘builds on my machine’ syndrome. Often this is realized by doing an acceptance or smoke test in a production like environment.

A canary test is an automated, non-destructive test that is run on a regular basis in a LIVE environment, such that if it ever fails, something really bad has happened. Examples might be:

• Has data that should only ever be available in DEV/TEST appeared in LIVE?
• Has a background process failed to run?
• Can a user log on?

52. Explain the difference between deadlock, livelock, and starvation?

Source: baeldung.com

In a multiprogramming environment, more than one process may compete for a finite set of resources. If a process requests a resource and the resource is not presently available, the process waits for it. Sometimes the waiting process never succeeds in getting access to the resource. This waiting for resources leads to three scenarios: deadlock, livelock, and starvation.

Deadlock

A deadlock is a situation in which processes block each other through resource acquisition, and none of the processes makes any progress while waiting for a resource held by another. Consider a deadlock scenario between process 1 and process 2: both processes hold one resource and wait for the other resource, held by the other process. This is a deadlock, as neither process 1 nor process 2 can make progress until one of them gives up its resource.


Conditions for Deadlock

To successfully characterize a scenario as deadlock, the following four conditions must hold simultaneously:

Mutual Exclusion: At least one resource needs to be held by a process in a non-sharable mode. Any other process requesting that resource needs to wait.

Hold and Wait: A process holds at least one resource while requesting additional resources that are currently held by other processes.

No Preemption: A resource can’t be forcefully released from a process. A process can only release a resource voluntarily once it deems to release.

Circular Wait: A set of processes {p0, p1, p2, ..., pn} exists such that p0 is waiting for a resource held by p1, p1 for a resource held by p2, and so on, with pn waiting for a resource held by p0.

 

How to Prevent Deadlock

To prevent the occurrence of deadlock, at least one of the necessary conditions discussed in the previous section should not hold true. Let us examine the possibility of any of these conditions being false:

Mutual Exclusion: In some cases, this condition can be false. For example, in a read-only file system, one or more processes can be granted sharable access. However, this condition can't always be false, because some resources are intrinsically non-sharable. For instance, a mutex lock is a non-sharable resource.

Hold and Wait: To ensure that the hold-and-wait condition never occurs, we need to guarantee that when a process requests a resource, it is not holding any other resource at that time. In general, a process should request all its resources before it begins execution.

No Preemption: To make this condition false, a process needs to make sure that it automatically releases all currently held resources if the newly requested resource is not available.

Circular Wait: This condition can be made false by imposing a total ordering on all resource types and ensuring that each process requests resources in increasing order of enumeration. Thus, given a set of n resources {r1, r2, ..., rn}, a process that requires resources r1 and r2 to complete its task needs to request r1 first and then r2 (see the sketch below).
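
A minimal Java sketch of breaking the circular-wait condition (the Account type and its id-based ordering are illustrative): both transfer directions acquire the two locks in the same global order, so no cycle can form.

```java
class Account {
    final int id;            // unique id; defines a total order over accounts
    long balance;
    Account(int id, long balance) { this.id = id; this.balance = balance; }
}

class Bank {
    static void transfer(Account a, Account b, long amount) {
        Account first  = a.id < b.id ? a : b;   // always lock the lower id first
        Account second = a.id < b.id ? b : a;
        synchronized (first) {
            synchronized (second) {
                a.balance -= amount;
                b.balance += amount;
            }
        }
    }
}
```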

Livelock

In a livelock, the states of the processes involved constantly change. Even so, the processes still depend on each other and can never finish their tasks.

Both “process 1” and “process 2” need a common resource. Each process checks whether the other process is in an active state; if so, it hands over the resource to the other process. However, as both processes are in an inactive state, they keep handing the resource over to each other indefinitely.

A real-world example of livelock occurs when two people call each other at the same time and both find the line busy. Both decide to hang up and try again after the same time interval; on the next retry, they end up in the same situation. This is a livelock, as it can go on forever.

Difference Between Deadlock and Livelock?

Although similar in nature, deadlocks and livelocks are not the same. In a deadlock, the processes involved are stuck indefinitely and do not change state. In a livelock, processes block each other and wait indefinitely, but they change their resource state continuously. The notable point is that the state changes have no effect and do not help the processes make any progress on their task.

Starvation

Starvation is the outcome of a process being unable to gain regular access to the shared resources it requires to complete a task, and thus being unable to make any progress. Consider, for example, the starvation of “process 2” and “process 3” for the CPU while “process 1” occupies it for a long duration.

What Causes Starvation?

Starvation is the outcome of a deadlock, a livelock, or continuous denial of a resource to a process.

As we have seen, in the event of a deadlock or a livelock a process competes with another process to acquire the resource it needs to complete its task. Due to the deadlock or livelock, it fails to acquire the resource and is starved of it.

Further, a process may repeatedly gain access to a shared resource or use it for long durations while other processes wait for the same resource. The waiting processes are thus starved of the resource by the greedy process.

Avoiding Starvation

One of the possible solutions to prevent starvation is to use a resource scheduling algorithm with a priority queue that also uses the aging technique. Aging is a technique that periodically increases the priority of a waiting process. With this approach, any process waiting for a resource for a longer duration eventually gains a higher priority. And as the resource sharing is driven through the priority of the process, no process starves for a resource indefinitely.

Another solution to prevent starvation is to follow the round-robin pattern while allocating the resources to a process. In this pattern, the resource is fairly allocated to each process providing a chance to use the resource before it is allocated to another process again.

53. Describe best practices for performance testing.

1. Understand your application:

Prior to implementing the application, it is essential to understand the application, its capabilities and offerings, its intended use, and the conditions where it is expected to thrive. Additionally, the team needs to develop an understanding of the probable limitations of the app. Listing out the common factors that might impact the performance can be an effective practice, followed by deploying these parameters while testing.

2. Setting realistic performance benchmarks

Businesses often end up developing unrealistic expectations. Hence, it is essential to set realistic baselines by selecting practical and realistic scenarios. Teams should ensure that the testbeds include multiple varieties of devices and environments where the app needs to thrive. For instance, several tests are executed right from a zero value, followed by adding load until it reaches the desired threshold. Nonetheless, this scenario is not realistic, and often engineers get a false picture of the system load, as the load can never reduce to nil and then progress further from that value.

3. Configuring the environment

In the initial stages post the test plans, a QA team should build a toolkit of load generation and performance monitoring tools. The testers create a bank of IP addresses that can be leveraged during sessions. As the project proceeds, it becomes a common practice to modify, change or expand the server performance testing toolkit for providing a broader view of the app performance.

4. Testing early and regularly

Performance testing has often been a throwaway sought in the later stages of the development cycle. However, to achieve satisfactory results from the app, performance tests must be at the crux, executed in the initial stages, and in a proactive manner. The earlier it is done, the better the team can identify and detect the bottlenecks with enough time in hand to properly eliminate them. Further, it becomes more complex and costly to implement modifications in the later stages of the development cycle. Thus, the best practice would be to perform these tests as part of the unit tests that will assist in quickly identifying performance issues and rectifying the same. It is wise to incorporate an agile strategy with agile methodologies trending today, employing iterative testing throughout the development lifecycle. Besides, teams should allow performance unit testing to be a part of the development process and later repeat similar tests on broader scales across subsequent stages for evaluating the app's preparation and maturity.

5. Understanding performance from the point-of-view of the end-users

There is a common tendency that performance tests focus more on the servers and clusters running software, resulting in inadequate measurement of the human elements. Measuring the performance of clustered servers might return a positive result, but users on a single overloaded server might experience unsatisfactory results. Instead, it is a better approach to also include the user's experience along with server experiences and responses. The tests should systematically capture every user's experience and interface timings with synchronicity to the metrics derived from the server. Combining the user perspectives and including a Beta version of the app can help capture the complete user experience seamlessly.

6. Performing System Performance Tests

Applications are built on many individual complex systems that include databases, app servers, web services, legacy systems, and many more. While conducting app performance testing, these systems should undergo rigorous performance testing individually and together. This modular testing approach helps detect weak links, identify the systems that can harm others, and determine which systems should be isolated for further app performance optimization.

7. Building a complete performance model

To measure the application's performance, one needs to understand the system's capacity. This practice involves planning what steady state looks like in terms of concurrent users, simultaneous requests, average user sessions, and server utilization during the peaks of the day. It is also essential to define performance goals such as maximum response times, system scalability targets, user satisfaction marks, and the maximum acceptable value for each performance metric.

It is also vital to define related thresholds that will send alerts for potential performance issues as the test passes through those thresholds. With increasing levels of risk, additional thresholds need to be defined.

Building this complete performance model should include:

·       Key performance indicators (KPIs), which include average latency, request and response times, server utilization

·       Business process completion rate involving the transactions per second and system throughput load profiles for average, peak, and spike tests

·       Hardware metrics that include CPU usage, memory usage, and network usage
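For illustration, threshold-based alerting over such KPIs can be as simple as the following Python sketch; the metric names and limits are illustrative assumptions.

# A minimal sketch of threshold-based alerting over collected KPIs.
# Metric names and limits are illustrative assumptions.

THRESHOLDS = {
    "avg_latency_ms": {"warn": 200, "critical": 500},
    "error_rate_pct": {"warn": 1.0, "critical": 5.0},
    "cpu_usage_pct":  {"warn": 70,  "critical": 90},
}

def check_thresholds(metrics: dict) -> list[str]:
    """Return alert messages for every metric that crosses a threshold."""
    alerts = []
    for name, value in metrics.items():
        limits = THRESHOLDS.get(name)
        if limits is None:
            continue
        if value >= limits["critical"]:
            alerts.append(f"CRITICAL: {name}={value} (limit {limits['critical']})")
        elif value >= limits["warn"]:
            alerts.append(f"WARN: {name}={value} (limit {limits['warn']})")
    return alerts

print(check_thresholds({"avg_latency_ms": 230, "error_rate_pct": 0.4, "cpu_usage_pct": 95}))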

8. Defining baselines for critical system functions

Most often, QA systems do not match the production systems. In such scenarios, baseline performance measurements provide more reasonable goals for each environment used for testing. In particular, when no previous metrics exist, baselines give an appropriate starting point for response-time goals without having to borrow targets from other applications. Baseline measurements such as single-user login time and response time for individual screens should preferably be taken with no system load.

9. Consistent reporting and result analysis

Planning, designing, and executing performance tests are crucial but not enough; reporting must also be an area of focus. Effective reporting conveys critical information and insights from the overall performance analyses and the outcomes of the app's activities, especially to project managers and developers. Consistent analysis and reporting also inform future fixes. Moreover, developer reports should be distinct from those provided to project managers, owners, and corporate executives.

10. Understand Performance Test Definitions

It’s crucial to have a common definition for the types of performance tests that should be executed against your applications, such as:

Single User Tests. Testing with one active user yields the best possible performance, and response times can be used for baseline measurements.

Load Tests. Understand the behavior of the system under average load, including the expected number of concurrent users performing a specific number of transactions within an average hour.

Peak Load Tests. Understand system behavior under the heaviest anticipated usage in terms of concurrent users and transaction rates.

Endurance (Soak) Tests. Determine the longevity of components, and whether the system can sustain average to peak load over a predefined duration. Monitor memory utilization to detect potential leaks.

Stress Tests. Understand the upper limits of capacity within the system by purposely pushing it to its breaking point.

High Availability Tests. Validate how the system behaves during a failure condition while under load. There are many operational use cases that should be included, such as seamless failover of network equipment or rolling server restarts.

Some additional practices for mobile applications

·       Considering network quality, as latency tends to be higher on mobile networks and connections are unpredictable.

·       Considering the entire product family, since performance and available resources often vary within a product family (e.g., iPhone 12 vs. iPhone 12 mini) and even more across Android devices.

·       Tests must be device-agnostic.

·       Using emulation, to a certain extent, when it is not feasible to install the app on all the physical devices the tests demand.

·       Deploying end-to-end tests as mobile apps are only as good as their back-end server response time plus their own processing time.

·       Expecting higher user expectations and more concurrent users.

·       Performing capacity testing, including the low memory and out-of-storage conditions.

55. Describe three categories of metrics used in performance testing.

You need to be able to make sense of what has and hasn't been tested and report on that work to stakeholders, who may not all be technologically inclined. Some people love monstrous spreadsheets. Most, however, want summarized or visualized data that answers a few key questions.

So, think about how to plan, execute and report on your performance testing efforts in a meaningful way. Ask yourself these three questions:

1. Can it go faster?

The efficiency of any software application is key to its success. According to Neil Patel and Kissmetrics: 'If an e-commerce site is making $100,000 per day, a one second page delay could potentially cost you $2.5 million in lost sales every year.'

These performance testing metrics include:

·       Average load times

·       Response times

·       Hits/connections per second

·       Network bytes total per second

·       Network output queue length

·       Throughput for received requests

·       Garbage collection

Remember: don't just focus on averages. An average is only useful alongside the standard deviation across data points, or, better still, replaced with a percentile, for example: 'page load time is less than 0.1 seconds, 99 percent of the time.'
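For illustration, the Python standard library can compute these figures directly; the sample below shows how a single outlier barely moves the mean yet dominates the 99th percentile (the data points are invented).

# Compute mean, standard deviation, and the 99th percentile of sampled
# page load times. The data points are invented for illustration.
import statistics

load_times_s = [0.08, 0.09, 0.07, 0.10, 0.09, 0.08, 0.11, 0.09, 0.08, 2.50]

mean = statistics.mean(load_times_s)
stdev = statistics.stdev(load_times_s)
p99 = statistics.quantiles(load_times_s, n=100)[98]  # 99th percentile

print(f"mean={mean:.3f}s stdev={stdev:.3f}s p99={p99:.3f}s")
# The single 2.5s outlier barely moves the mean but dominates the p99,
# which is what users at the tail actually experience.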

2. Can it go farther?

How many resources does a certain function use? Can the number of users scale up or down? Think of it as if you were buying a kid new shoes - does the application have room to grow, and at what point will you reach catastrophic failure (and the shoe seams burst!)?

These performance testing metrics include:

·       Top wait times for retrieving data from memory

·       Bandwidth (bits per second)

·       Memory/disk space/CPU usage

·       Amount of latency/lag

·       Concurrent users

·       Private bytes

You're looking for bottlenecks that slow or halt performance, usually caused by coding errors or database design, though hardware may contribute as well. Running endurance tests reveals how these issues emerge over time, even if they don't show up initially.

3. Can it go forever?

Consistency is key. A software application should work the same way every time. You want to measure how stable or reliable it is over time, especially in the face of spikes in usage or other unexpected events.

These performance testing metrics include:

·       Page faults (per second)

·       Committed memory

·       Maximum active sessions

·       Thread counts

·       Transactions passed or failed

·       Error rate

All types of software testing are, really, about finding breaking points. This is never more important than if you experience rapid demand, such as an increase in popularity. Will your software be able to cope with a surge of new users, such as those seen by Netflix and TikTok in 2020? If not, you risk missing out on a big opportunity.

Start going faster, farther, forever... earlier?

Performance testing should not be the last stage in development. By anticipating issues early on and planning ahead, you save yourself the headache of fixing performance problems uncovered during testing. That's why you'll want to engage a quality assurance engineer at the planning phase of your next new build or feature.

'Only conducting performance testing at the conclusion of system or functional testing is like conducting a diagnostic blood test on a patient who is already dead.'

— Scott Barber, Performance Architect

56. Behavioral software architect interview questions and answers

A. How do you stay up to date with the latest developments in the software architect field?

Some of the best approaches to stay up to date with the latest developments in the software architect field include:

·       Reading technical books

·       Working on side projects

·       Reading blogs

·       Completing courses

B. What are your favorite programming languages?

Each candidate may have a different response to this question, or they may not have a clear favorite. But it’s vital that your candidates can give rational and clear explanations for their choices.

For instance, if they don’t have a favorite language, they may explain that certain languages are better for particular projects. 

C. Have you ever failed when completing a project? What did you learn from the failure?

Each of your candidates will likely have experienced a time when they couldn’t complete a project. But they should have learned from the failure. For example, a candidate may describe a project they managed that was particularly big and complicated.

They may have had to coordinate between several teams, and although the project wasn’t as successful as hoped they may have learned valuable techniques to handle complex coordination.

D. What is your approach to task delegation?

It’s essential to get the right balance between delegating all tasks and completing every task without team support. Individual initiative is vital, but so is relying on your team.

The candidates to look for are those who clearly explain the importance of keeping an eye on the team and on the tasks that have been delegated.

E. Which features do you dislike about your favorite programming language?

Candidates may respond in a variety of ways to this question. But in general, the more limited their response, the lower their level of expertise is likely to be.

For example, suppose a candidate's only complaint is that Python delimits code blocks with whitespace. In that case, they may not fully understand the complexities of the language's style and philosophy.

 

Source: https://workat.tech/machine-coding/tutorial/software-design-principles-abstraction-extensibility-cohesion-acafi2r32c78

Source: https://www.maixuanviet.com/question/category/software-architecture

  

57. What is a 12-Factor Web App?
The 12-factor methodology is a set of twelve principles for developing applications that are cloud-native, independent, robust, and resilient. It was originally drafted in 2011 by Heroku for applications deployed as services on its cloud platform. The 12-factor principles became very popular because they align with microservices principles, and they remain an influential pattern for designing scalable application architectures.

 

The 12 Factors

The following are the 12 factors; each is described in detail below.

 

1. Codebase
2. Dependencies
3. Config
4. Backing Services
5. Build, Release, and Run
6. Processes
7. Port Binding
8. Concurrency
9. Disposability
10. Dev/Prod Parity
11. Logs
12. Admin Processes




1. Codebase

“One codebase tracked in revision control, many deploys”

Each microservice must live in a single repository tracked by a version-control system like Git (hosted on GitHub, GitHub Enterprise, GitLab, Bitbucket, etc.). If there are 20 microservices in an application, there should be 20 individual repositories.

Separate repositories let development teams build their microservices independently and stay language-agnostic (i.e., the services can be developed in different languages). This supports collaboration between teams and enables proper versioning of applications.

 

2. Dependencies

“Explicitly declare and isolate dependencies”

Most applications require external dependencies to build and run. Rather than packaging the dependencies inside the microservice, they must be explicitly declared and pulled in during the build process. This simplifies setup for new developers and provides consistency between development, staging, and production environments.

3. Config

“Store config in the environment”

Configuration refers to any value that can vary across deployments (e.g., developer machine, Dev, QA, production). It might include the following:

·       URL and credentials to connect to databases

·       URLs and other info for other services like Logs, Caching, etc

·       Credentials to third-party services such as Amazon AWS, Payment Providers, etc

There must be a clean separation between configuration and code. If config ships with the code, the application will fail in other environments, and there is a risk of leaking credentials. Externalizing configuration is very important for a microservice application: it lets us deploy to multiple environments regardless of the runtime we are using.
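A minimal Python sketch of this factor, assuming illustrative variable names like DATABASE_URL and CACHE_URL:

# Read configuration from environment variables instead of shipping it
# with the code. The variable names are illustrative conventions.
import os

def get_config() -> dict:
    return {
        # fail fast if a required setting is missing
        "database_url": os.environ["DATABASE_URL"],
        # optional settings can fall back to safe defaults
        "cache_url": os.environ.get("CACHE_URL", "redis://localhost:6379/0"),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
    }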

4. Backing Services

“Treat backing services as attached resources”

 

Backing services are any services the microservice communicates with over the network during operation. Examples include databases (e.g., MySQL, PostgreSQL), caches, message brokers, etc.

The Backing Services principle encourages architects and developers to treat components such as databases, email servers, message brokers, and independent services that can be provisioned and maintained as attached resources.

A resource should be swappable at any given point in time without impacting the service. For example, say you are using a MySQL database in AWS and would like to switch to Aurora: you should be able to do so with a config change alone, without making any code changes to your application.
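A minimal sketch of the idea: the database is located purely by a URL read from the environment, so repointing it is a config change. DATABASE_URL is an illustrative variable name.

# The database is an attached resource located purely by a URL, so swapping
# MySQL for Aurora is a config change, not a code change.
import os
from urllib.parse import urlparse

def describe_backing_db() -> dict:
    url = urlparse(os.environ["DATABASE_URL"])
    # e.g. mysql://user:pass@mysql.internal:3306/app, later repointed to
    #      mysql://user:pass@aurora-cluster.aws:3306/app with no code change
    return {"scheme": url.scheme, "host": url.hostname, "port": url.port,
            "database": url.path.lstrip("/")}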

5. Build, Release, and Run

“Strictly separate build and run stages”

 

The deployment process for a microservice application is divided into 3 stages namely Build, Release, and Run.

In the Build stage, the source code is retrieved from the source-control system (e.g., Git), and the dependencies are gathered and bundled into a build artifact (e.g., a WAR or JAR file). The output of this phase is a packaged, environment-agnostic artifact ready to run the application.

The Release stage takes the artifact from the build stage and applies configuration values (both environmental and app-specific) to produce a release. Each release is labeled with a unique ID so that a deployment can be rolled back to a previous version if needed.

The last stage is the Run stage, which usually occurs on the cloud provider and typically uses tooling such as containers or Terraform/Ansible to launch the application. Finally, the application and its dependencies are deployed into the newly provisioned runtime environment, such as Kubernetes pods, EC2 servers, or Lambda functions.

The build, release, and run stages are completely ephemeral: if anything goes wrong in the pipeline, all artifacts and environments can be reconstructed from scratch using assets stored in the source code repository.

 

6. Processes

“Execute the app as one or more stateless processes”

Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.

The Processes principle states that a 12-factor app should be stateless and should not store data in the application's memory; sticky sessions should not be used either. The process's memory space or filesystem may be used only as temporary scratch space.

For example, a process might download a large file, operate on it, and store the results of the operation in the database. Any data that needs to persist should live in a stateful backing service, such as a database or distributed cache, that all instances of the application can read.

When a process is stateless, instances can be scaled horizontally up and down, and statelessness prevents unintended side effects.
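A minimal sketch, assuming the redis-py client as the stateful backing service for session data:

# Session state lives in a backing service (here Redis via the redis-py
# client, an assumption), never in process memory, so any instance can
# serve any request.
import json
import os

import redis

store = redis.Redis.from_url(os.environ.get("CACHE_URL", "redis://localhost:6379/0"))

def save_session(session_id: str, data: dict) -> None:
    store.set(f"session:{session_id}", json.dumps(data), ex=3600)  # 1h TTL

def load_session(session_id: str):
    raw = store.get(f"session:{session_id}")
    return json.loads(raw) if raw else None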

 

7. Port Binding

“Export services via port binding”

The twelve-factor app is completely self-contained and does not rely on the runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port, and listening to requests coming in on that port.

The Port Binding principle states that a service should be identifiable on the network by its port number rather than by a domain name. Domain names and IP addresses can change dynamically due to automated discovery mechanisms, so relying on them is less dependable; a designated port number keeps the microservice consistently identifiable.

A port number is the simplest way to expose a process on the network. For example, port 80 is the conventional port for HTTP web servers, 443 is the default for HTTPS, 22 for SSH, 3306 for MySQL, and 27017 for MongoDB. Likewise, each of our microservices should have its own designated port number and run on it.
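A minimal self-contained sketch using only the Python standard library; PORT follows the common platform convention (e.g., Heroku), and the default of 8080 is an illustrative assumption:

# The app is self-contained and exports HTTP by binding to a port taken
# from the environment.
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"hello from a self-contained service\n")

if __name__ == "__main__":
    port = int(os.environ.get("PORT", 8080))
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()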

 

8. Concurrency

“Scale out via the process model”

The Concurrency principle addresses scaling the application. The twelve-factor methodology suggests running the application as multiple processes/instances (horizontal scaling) instead of running one large instance on a bigger machine (vertical scaling). By adopting containerization, applications can be scaled horizontally on demand.

An application is typically exposed to the network via web servers operating behind a load balancer, and those web servers in turn communicate with a business service running behind another load balancer. Under load, the business layer can be scaled up independently. In architectures that do not support this kind of concurrency, such as a monolith, the entire application must be scaled as one unit.

 

9. Disposability

“Maximize robustness with fast startup and graceful shutdown”

When an instance of the application starts up or shuts down, it should not impact the application state.

Applications should minimize startup time: a process should take only a few seconds to launch and be ready to receive requests or jobs.

Applications can crash for various reasons; the system should ensure that the impact is minimal and that the application is always left in a valid state.

By adopting containerization into the deployment process for microservices, we can make applications disposable. Docker containers can be started or stopped instantly. Storing request, state, or session data in queues or other backing services ensures that a request is handled seamlessly in the event of a container crash.
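A minimal sketch of graceful shutdown for a worker process: handle SIGTERM (the signal orchestrators typically send before killing a container), stop taking new work, drain, and exit. The loop body is a stand-in.

# Start fast, and shut down gracefully on SIGTERM.
import signal
import sys
import time

shutting_down = False

def handle_sigterm(signum, frame):
    global shutting_down
    shutting_down = True  # stop accepting new work

signal.signal(signal.SIGTERM, handle_sigterm)

while not shutting_down:
    time.sleep(0.1)       # stand-in for pulling and processing jobs

# drain in-flight work, close connections, then exit cleanly
sys.exit(0)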

 

10. Dev/Prod parity

“Keep development, staging, and production as similar as possible”

The dev/prod parity principle focuses on the importance of keeping development, staging, and production environments as similar as possible. This will eliminate unexpected issues when the code is deployed to production.

Tools like Docker are very helpful to implement this dev/test/prod parity. The benefit of a container is that it provides a uniform environment for running code by packaging the code and the dependencies required to run the code in the form of a docker image. Containers enable the creation and use of the same image in development, staging, and production. It also helps to ensure that the same backing services are used in every environment.

11. Logs

“Treat logs as event streams”

Logs are the stream of aggregated, time-ordered events collected from the output streams of all running processes and backing services. Logs in their raw form are typically a text format with one event per line. Logs have no fixed beginning or end but flow continuously as long as the app is operating.

The Logs factor highlights that an application should not concern itself with the routing, storage, or analysis of its output stream. In cloud-native applications, the aggregation, processing, and storage of logs is the responsibility of the cloud provider or of tool suites (e.g., the ELK stack, Splunk, Sumo Logic) running alongside the cloud platform. This improves flexibility for introspecting behavior over time and enables real-time metrics to be collected and analyzed effectively.
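A minimal sketch: write one event per line to stdout and leave routing and storage to the platform. The logger name and message are illustrative.

# Treat logs as an event stream: one event per line to stdout, routed and
# stored by the platform (Docker, Kubernetes, an ELK stack, etc.).
import logging
import sys

logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)

log = logging.getLogger("payment-service")
log.info("order accepted order_id=42 amount_cents=1999")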

12. Admin processes

“Run admin/management tasks as one-off processes”

Application deployment involves a number of one-off processes, such as data migrations and one-off scripts executed in a specific environment.

Twelve-factor principles advocate keeping such administrative tasks in the application codebase, in the same repository. One-off scripts then follow the same process defined for the codebase and run in an environment identical to the app's regular long-running processes.

They run against a release, using the same codebase and config as any process run against that release; shipping admin code with application code avoids synchronization issues.
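A minimal sketch of a one-off admin task that ships with the codebase and reuses the same config loading as the long-running processes; migrate and get_config are hypothetical names echoing the factor-3 sketch above.

# A one-off admin task (here a hypothetical data migration) that ships
# with the codebase and runs against the same config as the app.
import os

def get_config() -> dict:
    return {"database_url": os.environ["DATABASE_URL"]}

def migrate() -> None:
    config = get_config()
    print(f"running one-off migration against {config['database_url']}")
    # ... perform schema/data migration here ...

if __name__ == "__main__":
    migrate()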


