1. What do software architects do?
Software architects are expert, professional developers who liaise between
software engineering teams and clients to deliver precise software design
solutions. Some of their primary responsibilities are:
- Project code QA testing
- Task distribution for software
engineer teams
- Technical standards evaluation
- Breaking down project goals into
deliverable tasks
2. What is meant by the KISS principle?
Answer:
KISS, a backronym for "keep it simple, stupid", is a design principle noted
by the U.S. Navy in 1960. The KISS principle states that most systems work best
if they are kept simple rather than made complicated; therefore, simplicity
should be a key goal in design, and unnecessary complexity should be avoided.
Source: stackoverflow.com
3. Why is it a good idea for “lower” application
layers not to be aware of “higher” ones?
Answer:
The fundamental motivation is this:
You want to be able to rip an entire layer
out and substitute a completely different (rewritten) one, and NOBODY SHOULD (BE
ABLE TO) NOTICE THE DIFFERENCE.
The most obvious example is ripping the
bottom layer out and substituting a different one. This is what you do when you
develop the upper layer(s) against a simulation of the hardware, and then
substitute in the real hardware.
Layers, modules, and indeed architecture itself are also means of making
computer programs easier for humans to understand.
Source: stackoverflow.com
4. Which technical skills are required to be a
successful software architect?
As well as knowledge of
unified modeling language (UML), software architects need to have skills in
various programming languages. They should also understand agile
management and collaboration methods so they can align development and
operations.
5. Which soft skills are required to be a
successful software architect?
A crucial soft skill for
software architects is effective leadership, but there are other essential
skills, too. Some other soft skills required to be a good software architect
include:
- Communication skills
- Coaching abilities
- Prioritization skills
6. What Is Load Balancing?
Answer:
Load balancing is a simple technique for distributing workloads across multiple
machines or clusters. The most common and simplest load-balancing algorithm is
round robin, in which requests are distributed in circular order, ensuring that
all machines receive an equal number of requests and that no single machine is
overloaded or underloaded.
The purpose of load balancing is to:
- Optimize resource usage (avoid overload and
under-load of any machines)
- Achieve Maximum Throughput
- Minimize response time
The most common load-balancing techniques in web-based applications are:
- Round robin
- Session affinity or sticky session
- IP Address affinity
Source: fromdev.com
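As a minimal illustration of round robin, here is a Java sketch (the server list contents are hypothetical); each call to pick() returns the next server in circular order, so over time every server receives an equal share of requests:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin selection: requests are handed to servers in circular order.
public class RoundRobinBalancer {
    private final List<String> servers;          // e.g. "host:port" entries (hypothetical)
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinBalancer(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    public String pick() {
        // floorMod keeps the index non-negative even if the counter wraps around
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}
```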
7. What Is CAP Theorem?
Answer:
The CAP theorem for distributed computing was published by Eric Brewer. It
states that it is not possible for a distributed computer system to
simultaneously provide all three of the following guarantees:
- Consistency (all nodes see the same data at the same time, even with concurrent updates)
- Availability (a guarantee that every request receives a response about whether it was successful or failed)
- Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)
The CAP acronym corresponds to these three guarantees. This theorem created the
basis for modern distributed computing approaches. The world's highest-traffic
companies (e.g. Amazon, Google, Facebook) use it as a basis for deciding their
application architectures. It's important to understand that a system can
guarantee only two of these three conditions.
Source: fromdev.com
8. Define Microservice Architecture
Answer:
Microservices, aka Microservice Architecture,
is an architectural style that structures an application as a collection of
small autonomous services, modeled around a business domain.
Source: lambdatest.com
9. Why use WebSocket over Http?
Answer:
A WebSocket is a continuous
connection between client and server. That continuous connection allows the
following:
1. Data can be sent from server to client at any time, without the client even
requesting it. This is often called server push and is very valuable for
applications where the client needs to know fairly quickly when something
happens on the server (like a new chat message being received or a price being
updated). A client cannot be pushed data over HTTP; it would have to poll
regularly, making an HTTP request every few seconds, in order to get timely new
data. Client polling is not efficient.
2. Data can be sent either way very efficiently. Because the connection is
already established and a WebSocket data frame is very efficiently organized,
one can send data a lot more efficiently than via an HTTP request that
necessarily contains headers, cookies, etc.
Source: stackoverflow.com
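To illustrate server push, here is a hedged sketch using Java 11's built-in java.net.http.WebSocket client (the URL is a placeholder): the listener is invoked whenever the server sends a message, with no polling on the client side.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;

public class PushClient {
    public static void main(String[] args) throws Exception {
        WebSocket.Listener listener = new WebSocket.Listener() {
            @Override
            public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
                // Called whenever the server pushes a message; the client never polls.
                System.out.println("server pushed: " + data);
                ws.request(1); // signal readiness for the next message
                return null;
            }
        };
        HttpClient.newHttpClient()
                  .newWebSocketBuilder()
                  .buildAsync(URI.create("wss://example.com/updates"), listener) // placeholder URL
                  .join();
        Thread.sleep(60_000); // keep the demo process alive to receive pushes
    }
}
```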
10. What do you mean by lower latency
interaction?
Answer:
Low latency means that there is very little delay between the time you request
something and the time you get a response. As it applies to WebSockets, it
means that data can be sent more quickly (particularly over slow links) because
the connection has already been established, so no extra packet round trips are
required to establish the TCP connection.
Source: stackoverflow.com
11. What Is Scalability?
Answer:
Scalability is the ability of a system, network, or process to handle a growing
amount of load by adding more resources. Resources can be added in two ways:
- Scaling up: adding more resources to the existing nodes, for example more RAM, storage, or processing power.
- Scaling out: adding more nodes to support more users.
Either approach can be used to scale an application; however, the cost of
adding resources (per user) may change as volume increases. Adding resources to
the system should increase the application's ability to take more load in
proportion to the resources added.
An ideal application can serve a high level of load with few resources. In
practice, however, a linearly scalable system may be the best achievable
option. Poorly designed applications can have a very high scaling cost, since
they require more resources per user as the load increases.
Source: fromdev.com
12. Why Do You Need Clustering?
Answer:
Clustering is needed to achieve high availability for server software. The main
purpose of clustering is to achieve 100% availability, or zero downtime, in
service. A typical piece of server software runs on one machine and can serve
requests as long as there is no hardware or other failure. By creating a
cluster of more than one machine, we reduce the chances of our service becoming
unavailable if one of the machines fails.
Clustering does not always guarantee that a service will be 100% available,
since all the machines in a cluster could still fail at the same time. However,
that is unlikely if you have many machines and they are located in different
places or supported by their own resources.
Source: fromdev.com
13. What Is A Cluster?
Answer:
A cluster is a group of machines that can each individually run the software.
Clusters are typically utilized to achieve high availability for server
software, and clustering is used in many types of servers:
- App server cluster: a group of machines that run an application server that can be reliably utilized with a minimum of downtime.
- Database server cluster: a group of machines that run a database server that can be reliably utilized with a minimum of downtime.
Source: fromdev.com
14. What is Domain Driven Design?
Answer:
Domain Driven Design is a methodology and process
prescription for the development of complex systems whose focus is mapping
activities, tasks, events, and data within a problem domain into the technology
artifacts of a solution domain.
It is all about trying to make your software
a model of a real-world system or process.
Source: stackoverflow.com
15. What defines a software architect?
Answer:
An architect is the captain
of the ship, making the decisions that cross multiple areas of concern
(navigation, engineering, and so on), taking final responsibility for the
overall health of the ship and its crew (project and its members), able to step
into any station to perform those duties as the need arises (write code for any
part of the project should they lose a member). They have to be familiar with
the problem domain and the technology involved, and keep an eye on new
technologies that might make the project easier or answer new customers'
feature requests.
Source: stackoverflow.com
16. What does the expression “Fail Early” mean,
and when would you want to do so?
Answer:
Essentially, fail fast (a.k.a. fail
early) is to code your software such that, when there is a problem,
the software fails as soon as and as visibly as possible,
rather than trying to proceed in a possibly unstable state.
The fail-fast approach won’t reduce the overall number of bugs, at least not at
first, but it’ll make most defects much easier to find.
Source: stackoverflow.com
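A minimal fail-fast sketch in Java (names are illustrative): invalid input is rejected immediately and visibly, instead of letting the object continue in a bad state and fail strangely later.

```java
public class Account {
    private long balanceCents;

    public void withdraw(long amountCents) {
        // Fail fast: surface the problem at the point where it is introduced.
        if (amountCents <= 0) {
            throw new IllegalArgumentException("amount must be positive: " + amountCents);
        }
        if (amountCents > balanceCents) {
            throw new IllegalStateException("insufficient funds");
        }
        balanceCents -= amountCents;
    }
}
```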
17. What does “program to interfaces, not
implementations” mean?
Answer:
Coding against an interface means that the client code always holds an
interface-typed object, which is supplied by a factory. Any instance returned
by the factory is of the interface type, which every factory candidate class
must have implemented. This way the client program does not worry about
implementations, and the interface signature determines which operations can be
performed.
This approach can be used to change the behavior of a program at run time. It
also helps you write programs that are far better from a maintenance point of
view.
Source: tutorialspoint.com
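A small Java sketch of the idea (all names hypothetical): the client holds only the interface type, and the factory decides which implementation it gets, so the behavior can be swapped without touching client code.

```java
// The client depends only on this contract.
interface MessageSender {
    void send(String to, String body);
}

class EmailSender implements MessageSender {
    public void send(String to, String body) { /* SMTP details here */ }
}

class SmsSender implements MessageSender {
    public void send(String to, String body) { /* SMS gateway details here */ }
}

class SenderFactory {
    // The factory picks the implementation; callers never name a concrete class.
    static MessageSender create(String channel) {
        return "sms".equals(channel) ? new SmsSender() : new EmailSender();
    }
}

class Client {
    void notifyUser(String channel) {
        MessageSender sender = SenderFactory.create(channel); // behavior chosen at run time
        sender.send("user@example.com", "hello");
    }
}
```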
18. What is Elasticity (in contrast to
Scalability)?
Answer:
Elasticity means that the throughput of a system scales up or down
automatically to meet varying demand as resources are proportionally added or
removed. The system needs to be scalable to allow it to benefit from the
dynamic addition, or removal, of resources at runtime. Elasticity therefore
builds upon scalability and expands on it by adding the notion of automatic
resource management.
19. Explain what test-driven
development means?
Test-driven development is a process in which you write the test before you
write the code. When all tests are passing, you improve the code: you're
(re-)designing your code to make and keep it easily testable, and that makes it
clean, uncomplicated, and easy to understand and change.
20. Difference between Cloud Elasticity and Scalability:
Cloud Elasticity | Cloud Scalability
Elasticity is used to meet sudden increases and decreases in workload for a short period of time. | Scalability is used to meet a static increase in workload.
Elasticity is used to meet dynamic changes, where the resource needs can increase or decrease. | Scalability is always used to address increases in workload in an organization.
Elasticity is commonly used by small companies whose workload and demand increase only for a specific period of time. | Scalability is used by giant companies whose customer base grows persistently, in order to run operations efficiently.
Elasticity is short-term planning, adopted to deal with an unexpected increase in demand or seasonal demand. | Scalability is long-term planning, adopted to deal with an expected increase in demand.
21. What Is Backpressure?
Resistance or force opposing the desired flow of
data through software.
Let's use an example to clearly describe what it
is:
1. The
system contains three services: the Publisher, the Consumer, and the Graphical
User Interface (GUI)
2. The
Publisher sends 10000 events per second to the Consumer
3. The
Consumer processes them and sends the result to the GUI
4. The
GUI displays the results to the users
5. The
Consumer can only handle 7500 events per second
At this rate, the Consumer cannot manage the events (backpressure).
Consequently, the system would collapse and the users would not see the
results.
How you handle backpressure can pretty much be
summed up with three possible options:
· Control
the producer (slow down/speed up is decided by consumer)
· Buffer
(accumulate incoming data spikes temporarily)
· Drop
(sample a percentage of the incoming data)
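The first option, consumer-controlled demand, is what Java's built-in Flow API implements: a subscriber tells the publisher how many items it can take via request(n). A minimal sketch (rates are illustrative):

```java
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class BackpressureDemo {
    public static void main(String[] args) throws InterruptedException {
        SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>();

        publisher.subscribe(new Flow.Subscriber<Integer>() {
            private Flow.Subscription subscription;

            public void onSubscribe(Flow.Subscription s) {
                subscription = s;
                s.request(1); // the consumer sets the pace: one item at a time
            }
            public void onNext(Integer item) {
                try { Thread.sleep(10); } catch (InterruptedException ignored) {} // slow consumer
                subscription.request(1); // ask for the next item only when ready
            }
            public void onError(Throwable t) { t.printStackTrace(); }
            public void onComplete() { System.out.println("done"); }
        });

        // submit() blocks when the subscriber's buffer is full, throttling the producer.
        for (int i = 0; i < 100; i++) publisher.submit(i);
        publisher.close();
        Thread.sleep(2_000); // give the subscriber time to drain (demo only)
    }
}
```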
22. What is the difference
between the WebSocket and REST API?
Criteria | WebSocket | REST API
Performance | WebSockets have a low overhead per message. They’re ideal for use cases that require low-latency, high-frequency communication. | REST APIs have a higher message overhead compared to WebSockets. They’re best suited for use cases where you want to create, retrieve, delete, or update resources.
Nature | Socket-based. | Resource-based.
HTTP use | WebSocket uses HTTP only during the initial request/response handshake (connection establishment). | REST uses HTTP to enable client-server communication.
Communication | Event-driven and bidirectional. | Request-driven and unidirectional.
State | WebSocket is a stateful protocol. | REST uses the HTTP protocol, which is stateless.
TCP connection | A WebSocket connection uses a single TCP connection for data exchange. | Every request/response requires a new TCP connection.
23. Explain how concurrency is
different from parallelism.
Concurrency is when two or
more tasks can start, run, and complete in overlapping time periods. It doesn't
necessarily mean they'll ever both be running at the same instant. For example,
multitasking on a single-core machine.
Parallelism is when tasks
literally run at the same time, e.g., on a multicore processor.
Concurrency | Parallelism
Concurrency is the task of running and managing multiple computations at the same time. | Parallelism is the task of running multiple computations simultaneously.
Concurrency is achieved by interleaving the operation of processes on the central processing unit (CPU), in other words by context switching. | Parallelism is achieved through multiple central processing units (CPUs).
Concurrency can be achieved using a single processing unit. | Parallelism cannot be achieved with a single processing unit; it needs multiple processing units.
Concurrency increases the amount of work finished at a time. | Parallelism improves the throughput and computational speed of the system.
Concurrency deals with many things at once. | Parallelism does many things at once.
Concurrency is a non-deterministic control-flow approach. | Parallelism is a deterministic control-flow approach.
Debugging concurrent programs is very hard. | Debugging parallel programs is also hard, but simpler than debugging concurrent ones.
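A compact Java illustration of the distinction (a sketch; output order will vary): the first block interleaves tasks on one thread (concurrency), while the second spreads work across cores (parallelism).

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConcurrencyVsParallelism {
    public static void main(String[] args) {
        Runnable task = () -> System.out.println(Thread.currentThread().getName());

        // Concurrency: one thread interleaves many tasks; they overlap in time
        // but never run at the same instant.
        ExecutorService single = Executors.newSingleThreadExecutor();
        for (int i = 0; i < 4; i++) single.submit(task);
        single.shutdown();

        // Parallelism: a parallel stream spreads the work across multiple cores,
        // so tasks can literally run at the same instant.
        List.of(1, 2, 3, 4).parallelStream().forEach(n -> task.run());
    }
}
```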
24. Session affinity?
Session affinity is a feature available on load balancers that allows all
subsequent traffic and requests from an initial client session to be passed to
the same server in the pool. Session affinity is also referred to as session
persistence, server affinity, server persistence, or sticky sessions.
25. Session Replication?
Session replication is a mechanism used to replicate the data stored in a
session across different instances; the replicated instances must be part of
the same cluster. When session replication is enabled in a cluster environment,
the entire session data is copied to the replicated instances.
26. Session affinity(sticky
session) vs Session Replication?
If you're using session replication without sticky sessions: Imagine you have
only one user using your web app, and you have three Tomcat instances. This
user sends several requests to your app; the load balancer will send some of
these requests to the first Tomcat instance, some to the second instance, and
others to the third.
If you're using sticky sessions without replication: Imagine you have only one
user using your web app, and you have three Tomcat instances. This user sends
several requests to your app; the load balancer will send the first request to
one of the three Tomcat instances, and all the other requests sent by this user
during the session will be sent to the same instance. During these requests, if
you shut down or restart that Tomcat instance (the instance being used), the
load balancer sends the remaining requests to another Tomcat instance that is
still running. But since you don't use session replication, the instance that
receives the remaining requests doesn't have a copy of the user's session, so
for this Tomcat the user begins a new session: the user loses their session and
is disconnected from the web app, although the web app is still running.
If you're using sticky sessions WITH session replication: Imagine you have only
one user using your web app, and you have three Tomcat instances. This user
sends several requests to your app; the load balancer will send the first
request to one of the three Tomcat instances, and all the other requests sent
by this user during the session will be sent to the same instance. During these
requests, if you shut down or restart that Tomcat instance (the instance being
used), the load balancer sends the remaining requests to another Tomcat
instance that is still running. Because you use session replication, the
instance that receives the remaining requests has a copy of the user's session,
so the user keeps their session: the user continues to browse your web app
without being disconnected, and the shutdown of the Tomcat instance doesn't
impact the user's navigation.
27. What Is Middle Tier
Clustering?
Middle-tier clustering is simply clustering applied to the middle tier of an
application. It is popular because many clients may use the middle tier, and it
may carry heavy load, which requires it to be highly available.
Failure of the middle tier can cause multiple clients and systems to fail;
therefore clustering the middle tier of an application is a common approach.
In the Java world, it is very common to have EJB server clusters used by many
clients. In general, any application with business logic shared across multiple
clients can use a middle-tier cluster for high availability.
28. Explain high availability
in the software architect field?
There are organizations that
require their systems to be operational 24/7. For these organizations, HA
architecture is essential. While HA does not guarantee that systems will not be
hit by unplanned interruptions, it minimizes the impact of such interruptions
on your operations. A more responsive system is another benefit of HA.
HA architecture ensures that
your systems are up and running and accessible to your users in the face of
unforeseen circumstances such as hardware and software failures. With it, you
use multiple components to ensure continuous and responsive service.
Redundant hardware: Lack of redundant hardware means no
requests can be served until a server is restarted after a crash. When this
happens, downtime is inevitable. Thus, your HA architecture must include backup
hardware such as servers or server clusters that take over automatically in
case of production hardware crashes.
Redundant software and
applications: To
prevent potential downtime whenever there are failures in the software and
applications used in your production environment, it is crucial that your HA
architecture includes backup software and applications.
Redundant data: Database servers that go offline for one
reason or another can wreak havoc on your production environment. Your HA
architecture should include provisions for backup database servers to which
processing can be shifted whenever a production database server goes offline.
No single point of failure: A failure in a single component should
not crash your entire infrastructure. With redundancy in hardware, software,
and data, single points of failure are eliminated.
29. What does fault tolerance
mean?
Fault tolerance is a process
that enables an operating system to respond to a failure in hardware or
software. This fault-tolerance definition refers to the system’s ability to
continue operating despite failures or malfunctions.
An operating system designed for fault tolerance cannot be disrupted by a
single point of failure. It ensures business continuity and the high
availability of crucial applications and systems regardless of any failures.
30. How Does Fault Tolerance
Work?
Fault tolerance can be built
into a system to remove the risk of it having a single point of failure. To do
so, the system must have no single component that, if it were to stop working
effectively, would result in the entire system failing.
Fault tolerance is reliant
on aspects like load balancing and failover, which remove the risk of a single
point of failure. It will typically be part of the operating system’s
interface, which enables programmers to check the performance of data
throughout a transaction.
31. What does fault resilience
mean?
Fault resilience means that failure is observed in some services, but the rest
of the system continues to function normally.
Resilience measures how many faults a system can tolerate.
32. What is the DRY principle?
The DRY (don't repeat yourself) principle is a best practice in software
development that recommends software engineers do something once, and only
once. The goal of the DRY principle is to lower technical debt by eliminating
redundancies in process and logic whenever possible.
Redundancies in process
To prevent redundancies in
processes (actions required to achieve a result), followers of the DRY
principle seek to ensure that there is only one way to complete a particular
process. Automating the steps wherever possible also reduces redundancy, as
well as the number of actions required to complete a task.
Redundancies in logic
To prevent redundancies in
logic (code), followers of the DRY principle use abstraction to minimize
repetition. Abstraction is the process of removing characteristics until only
the most essential characteristics remain.
An important goal of the DRY
principle is to improve the maintainability of code during all phases of its
lifecycle. When the DRY principle is followed, for example, a software
developer should be able to change code in one place, and have the change
automatically applied to every instance of the code in question.
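A tiny Java sketch of DRY via abstraction (names are illustrative): a validation rule that used to be pasted into several methods is extracted, so a change to the rule happens in exactly one place.

```java
public class Orders {
    // One authoritative definition of the rule, reused everywhere.
    private static void requireValidQuantity(int quantity) {
        if (quantity < 1 || quantity > 1_000) {
            throw new IllegalArgumentException("quantity out of range: " + quantity);
        }
    }

    // Before DRY, each of these methods carried its own copy of the bounds check.
    public void placeOrder(int quantity)  { requireValidQuantity(quantity); /* ... */ }
    public void updateOrder(int quantity) { requireValidQuantity(quantity); /* ... */ }
    public void importOrder(int quantity) { requireValidQuantity(quantity); /* ... */ }
}
```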
33. What is the DIE principle?
DIE in software development
is an acronym that means “duplication is evil.” The DIE principle is used in
the same situations as the DRY principle and aims to ensure that software
architects and developers avoid duplicating concepts. It also contributes to
efficient code maintainability.
34. Explain the ACID acronym?
Atomicity
All changes to data are
performed as if they are a single operation. That is, all the changes are
performed, or none of them are.
For example, in an
application that transfers funds from one account to another, the atomicity
property ensures that, if a debit is made successfully from one account, the
corresponding credit is made to the other account.
Consistency
Data is in a consistent
state when a transaction starts and when it ends.
For example, in an
application that transfers funds from one account to another, the consistency
property ensures that the total value of funds in both the accounts is the same
at the start and end of each transaction.
Isolation
The intermediate state of a
transaction is invisible to other transactions. As a result, transactions that
run concurrently appear to be serialized.
For example, in an
application that transfers funds from one account to another, the isolation
property ensures that another transaction sees the transferred funds in one
account or the other, but not in both, nor in neither.
Durability
After a transaction
successfully completes, changes to data persist and are not undone, even in the
event of a system failure.
For example, in an
application that transfers funds from one account to another, the durability property
ensures that the changes made to each account will not be reversed.
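A hedged JDBC sketch of the transfer example (table and column names are hypothetical): grouping the debit and the credit between setAutoCommit(false) and commit() is what makes the pair atomic; on any error, rollback() undoes both.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Transfers {
    // Debit one account and credit another as a single atomic transaction.
    public static void transfer(Connection conn, long fromId, long toId, long cents)
            throws SQLException {
        conn.setAutoCommit(false); // group both updates into one transaction
        try (PreparedStatement debit =
                 conn.prepareStatement("UPDATE accounts SET balance = balance - ? WHERE id = ?");
             PreparedStatement credit =
                 conn.prepareStatement("UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
            debit.setLong(1, cents);  debit.setLong(2, fromId);  debit.executeUpdate();
            credit.setLong(1, cents); credit.setLong(2, toId);   credit.executeUpdate();
            conn.commit();   // both changes become durable together...
        } catch (SQLException e) {
            conn.rollback(); // ...or neither does
            throw e;
        }
    }
}
```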
35. You Aren't Gonna Need It
(YAGNI)?
You Aren't Gonna Need It
(YAGNI) is an Extreme Programming (XP) practice which states: "Always
implement things when you actually need them, never when you just foresee that
you need them."
Even if you're totally, totally, totally sure that you'll need a feature later
on, don't implement it now. Usually, it'll turn out either that you don't need
it after all, or that what you actually need is quite different from what you
foresaw needing earlier.
This doesn't mean you should
avoid building flexibility into your code. It means you shouldn't overengineer
something based on what you think you might need later on.
There are two main reasons
to practice YAGNI:
· You save time
because you avoid writing code that you turn out not to need.
· Your code is
better because you avoid polluting it with 'guesses' that turn out to be more
or less wrong but stick around anyway.
36. Explain what SOLID means.
The SOLID acronym comprises five principles for software architecture and
development roles. These principles are:
Single responsibility: "There should never be more than
one reason for a class to change." In other words, every class should have
only one responsibility.
Open/closed: The open/closed principle indicates that
although a module or class should be open for extension, it should be closed
for modification.
Liskov substitution: "Functions that use pointers or
references to base classes must be able to use objects of derived classes
without knowing it".
Interface segregation: "Many client-specific interfaces
are better than one general-purpose interface."
Dependency inversion: The dependency inversion principle suggests that a
high-level class shouldn’t depend on a low-level class; both should depend on
abstractions, not concretions.
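For dependency inversion specifically, a minimal Java sketch (hypothetical names): the high-level OrderService depends only on an abstraction, and the low-level detail implements it.

```java
// Abstraction that both levels depend on.
interface Notifier {
    void send(String message);
}

// Low-level detail: depends on the abstraction, not the other way around.
class EmailNotifier implements Notifier {
    public void send(String message) { /* deliver an email */ }
}

// High-level policy: knows nothing about email, SMS, etc.
class OrderService {
    private final Notifier notifier;

    OrderService(Notifier notifier) { // dependency is injected
        this.notifier = notifier;
    }

    void placeOrder() {
        // ... business logic ...
        notifier.send("order placed");
    }
}
```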
37. Difference between Shared Nothing Architecture
and Shared Disk Architecture ?
Shared Nothing Architecture | Shared Disk Architecture
In a shared-nothing architecture, the nodes share neither memory nor storage. | In a shared-disk architecture, the nodes share storage.
Each node has its own disks, which cannot be shared. | The disks are shared by the active nodes and remain accessible if a node fails.
Its hardware is cheaper compared to shared-disk architecture. | Shared-disk hardware is comparatively expensive.
The data is strictly partitioned. | The data is not partitioned.
It has fixed load balancing. | It has dynamic load balancing.
Scaling up in terms of capacity is easier: to get more space, a new node can be added to the cluster. | Because this clustering architecture uses a single disk device with distinct memories, capacity is increased by upgrading the memory.
Its advantage is high availability. | Its advantage is unlimited scalability.
Pros: easy to scale; reduces single points of failure, makes upgrades easier, and avoids downtime. | Pros: can scale up to a fair number of CPUs; each processor has its own memory, so the memory bus is not a bottleneck; fault tolerance, since the database is stored on disks accessible from all processors, so other processors can take over if one fails.
Cons: performance can deteriorate; expensive. | Cons: limited scalability, because the disk subsystem’s interconnection becomes the bottleneck; slower CPU-to-CPU communication, since it passes through a communication network; adding more CPUs slows things down through increased contention for memory access and network bandwidth.
38. Explain what sharding is?
Sharding is a method software architects use to split one logical dataset and
store it across several databases. Distributing it across several machines
makes it possible to store a larger dataset.
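A minimal sketch of shard routing by key hash (the shard list is hypothetical); real systems additionally have to handle resharding, replication, and hot keys:

```java
import java.util.List;

public class ShardRouter {
    private final List<String> shards; // e.g. JDBC URLs of the shard databases (hypothetical)

    public ShardRouter(List<String> shards) {
        this.shards = List.copyOf(shards);
    }

    // The same key always maps to the same shard, so reads find what writes stored.
    public String shardFor(String key) {
        return shards.get(Math.floorMod(key.hashCode(), shards.size()));
    }
}
```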
39. Explain why layering an
application is vital?
There are several reasons
why it's important to divide an application into layers, such as:
Improves component classification: Thanks to the separation of concerns, each
layer performs its own function. This is beneficial because it becomes easier
to give every component its own individual classification, and it helps you
develop effective responsibility models and roles in your software
architecture.
Low overhead costs: Unlike other software architectures, the
division of layers in an application simplifies the development process and can
significantly reduce overhead costs. Therefore, you can allocate your savings
towards fulfilling other important development operations.
Easier to write and develop
applications: When
you divide an application into layers, it becomes easier to work on them as
individual units, rather than as one large complex system. This type of process
means that you can develop the entire application more effectively.
Easier to test applications: With separate layers in the software architecture,
you can test each layer one at a time. This is beneficial because you can
gather critical information about every layer without mixing up any of the
data.
Benefits from layers of isolation: The layers of isolation concept refers
to your ability to change one layer of the architecture, without those changes
directly affecting components within any of the other layers. This is important
because it means you can make modifications to the layers without fear of them
negatively impacting one another.
Improves problem-solving
initiatives: With
each individual layer having a singular focus, it becomes easier to solve
problems within those layers. When you solve an issue with one layer, you can
move on to addressing other problems in other layers without your solutions
causing any adverse effects.
Easier to identify errors: Because there are multiple layers for
you to analyze, it's easier for you to identify any errors within them. You can
then implement fixes to address those errors in a timely manner before they
have the opportunity to escalate.
Highly effective for monolithic applications: Because all the layers and
functionalities exist within one architecture, the layered pattern is
particularly useful for monolithic applications. These types of applications
are simple to develop, and you can often complete them quickly.
Well-known in the industry: Most software developers are familiar
with the layered architecture pattern, which can be beneficial for you and your
colleagues working on a project. This means that it's easy to collaborate on
with other skilled professionals.
40. Explain what cache
stampede means?
A cache stampede occurs when
several threads attempt to access a cache in parallel. If the cached value
doesn’t exist, the threads will then attempt to fetch the data from the origin
at the same time. The origin is commonly a database but it can also be a web
server, third-party API, or anything else that returns data.
One of the main reasons why
a cache stampede can be so devastating is because it can lead to a vicious
failure loop:
· A substantial
number of concurrent threads get a cache miss, leading to them all calling the
database.
· The database
crashes due to an enormous CPU spike and leads to timeout errors.
· Receiving the
timeout, all the threads retry their requests — causing another stampede.
· On and on the
cycle continues.
41. Avoiding Cache Stampede?
Cache stampede is a fancy
name for a cascading failure that will occur when caching mechanisms and
underlying application layers come under high load both at the same time in a
very short time.
Cache stampede is not something that will be considered a problem for most
systems out there; it is an issue one encounters at a certain scale with
specific data access patterns.
To give you an example, imagine you are running an expensive SQL query that
takes 3 seconds to complete and spikes the CPU usage of your database server.
You want to cache that query, as running several of them in parallel can cause
database performance problems from which your database will not recover,
bringing your entire app down.
Let’s say you have a traffic
of 100 requests a second. With our 3-second query example, if you get 100
requests in one second and your cache is cold, you’ll end up with 300 processes
all running the same uncached query.
This becomes a real problem in high-load systems that receive hundreds or even
thousands of requests a second: an initial population of the cache (or a
re-population) causes massive amounts of simultaneous writes to the cache and a
massive amount of traffic going to the underlying systems. This has the
potential to cause major problems for your infrastructure.
Avoiding Cache Stampede
In general, there are three
approaches to avoid cache stampede:
· external
recomputation,
· probabilistic
early expiration,
· locking.
External recomputation
This involves keeping an
external process or a system alive that will re-compute your values and
re-populate cache either on predefined intervals or when a cache miss happens.
The first issue is working
with a cold cache. As I mentioned above, a flood of cache misses in a short
period of time has the potential to run many expensive operations in parallel.
Continually re-populating the cache also comes with a drawback: if there is no
uniform access pattern to your data, you will end up with a cache bloated with
useless information.
Probabilistic early
expiration
When a cache entry nears expiration, one of the processes (or actors) in your
system volunteers to re-calculate its value and re-populate the cache with a
new value.
Rails implements this as the :race_condition_ttl setting on the cache itself.
For this approach to be effective, you have to have at least one request per
TTL; otherwise your cache will go cold, pushing you back to square one and
facing a stampede.
If you are not setting some sort of lock on re-calculating the cache while
using probabilistic early expiration, you will end up with multiple processes
hammering your cache and underlying systems, computing and re-writing the same
value.
Locking
This one is the simplest but also the most versatile of the three. As soon as a
process (or an actor) encounters a cache miss, it sets a global lock on
re-computing the requested value, which prevents other processes from
attempting the same recompute. Once the first request has computed the value,
it writes it to the cache and releases the lock.
It will be up to you to
decide what to do when another process encounters a cache miss and there is a
global lock on a requested key. You can make it wait, make it return nil, raise
an error, etc.
Locking has other benefits
as well. You can keep your cache TTLs low and not have to worry about storing
or serving stale data, in an effort to avoid a cache stampede.
You can also combine locking with external recomputation or probabilistic
expiration to avoid issues with a cold cache.
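A single-process Java sketch of the locking approach (a distributed setup would need a shared lock instead, and this sketch omits TTLs): ConcurrentHashMap.computeIfAbsent effectively locks per key, so on a miss exactly one caller runs the expensive recompute while concurrent callers for the same key wait for its result.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class StampedeSafeCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    // On a miss, one thread computes the value; other threads asking for the
    // same key block until it is cached, instead of all hitting the database.
    public V get(K key, Supplier<V> expensiveLoad) {
        return cache.computeIfAbsent(key, k -> expensiveLoad.get());
    }
}
```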
42. Explain what shared
nothing architecture is?
Shared-nothing architecture is an architecture used in distributed computing in
which each node is independent and different nodes are interconnected by a
network. Every node is made of a processor, main memory, and disk. The main
motive of this architecture is to remove contention among nodes. The nodes do
not share memory or storage; each node's disks are its own and cannot be
shared. It works effectively in high-volume, read/write environments.
43. Explain what shared Disk architecture is?
Shared-disk architecture is an architecture used in distributed computing in
which the nodes share the same disk devices, but each node has its own private
memory. The disks are accessible from all the cluster nodes, so active nodes
can take over work in case of failures. This architecture adapts quickly to
changing workloads and uses robust optimization techniques.
44. What is the “robust”
software building approach?
The Seven Components of a
Robust Software Development Process
They are as follows:
1. A steadfast
development process that can provide interaction with users by identifying
their spoken and unspoken requirements throughout the software life cycle.
2. Provision for
feedback and iteration between two or more development stages as needed
3. An instrument to optimize design for reliability (or other attributes),
cost, and cycle time all at once at upstream stages. This particular activity,
which addresses software product robustness, is one of the unique features of
the RSDM, because other software development models do not address it.
4. Opportunity for the early
return on investment that incremental development methods provide.
5. Step-wise
development to build an application as needed and to provide adequate
documentation.
6. Provision for risk
analyses at various stages.
7. Capability to
provide for object-oriented development.
45. What is the difference
between the “fail fast” and “robust” software building approaches?
Some people recommend making
your software robust by working around problems automatically. This results in
the software “failing slowly.” The program continues working right after an
error but fails in strange ways later on.
A system that fails fast
does exactly the opposite: when a problem occurs, it fails immediately and
visibly. Failing fast is a nonintuitive technique: “failing immediately and
visibly” sounds like it would make your software more fragile, but it actually
makes it more robust. Bugs are easier to find and fix, so fewer go into
production.
Overall, the quicker and easier the failure is, the faster it will be fixed,
and the fix will be simpler and more visible. Fail fast is a much better
approach for maintainability.
46. Explain heuristic exceptions?
A Heuristic Exception refers
to a transaction participant’s decision to unilaterally take some action
without the consensus of the transaction manager, usually as a result of some
kind of catastrophic failure between the participant and the transaction
manager.
In a distributed environment
communications failures can happen. If communication between the transaction
manager and a recoverable resource is not possible for an extended period of
time, the recoverable resource may decide to unilaterally commit or rollback
changes done in the context of a transaction. Such a decision is called a
heuristic decision. It is one of the worst errors that may happen in a
transaction system, as it can lead to parts of the transaction being committed
while other parts are rolled back, thus violating the atomicity property of the
transaction and possibly leading to data integrity corruption.
Because of the dangers of
heuristic exceptions, a recoverable resource that makes a heuristic decision is
required to maintain all information about the decision in stable storage until
the transaction manager tells it to forget about the heuristic decision. The
actual data about the heuristic decision that is saved in stable storage
depends on the type of recoverable resource and is not standardized. The idea
is that a system manager can look at the data, and possibly edit the resource
to correct any data integrity problems.
47. Explain what cohesion
means in software architecture?
Cohesion is the degree to which the various responsibilities of a module are
strongly related and focused. It is a measure of the strength of the
relationship between a class’s methods and data. We should strive to maximize
cohesion; high cohesion results in components that are easier to understand,
maintain, and reuse.
Cohesion is increased if:
· The
functionalities embedded in a class, accessed through its methods, have much in
common.
· Methods carry out a small number of related activities, avoiding coarsely
grained or unrelated sets of data.
· Related methods
are in the same source file or otherwise grouped together; for example, in
separate files but in the same sub-directory/folder.
48. Explain what coupling
means in software architecture?
Coupling is the degree to
which each module depends on other modules; a measure of how closely connected
two modules are. We should strive to minimize coupling.
Coupling is usually
contrasted with cohesion. Low coupling often correlates with high cohesion and
vice versa.
Tightly coupled modules have
the following disadvantages:
· Change in one
module might break another module.
· Change in one
module usually forces a ripple effect of changes in other modules.
· Reusability
decreases as dependency over other modules increases.
· Assembly of
modules might require more effort and/or time.
Coupling can be reduced by:
· Hiding inner details and interacting through interfaces.
· Avoiding direct interaction with classes a module does not need to deal
with.
· Designing loosely coupled components, which can then be replaced with
alternative implementations that provide the same services.
49. What is eventual
consistency?
Unlike the strict consistency of relational databases, the eventual consistency
property of a system ensures that any transaction will eventually (not
immediately) bring the database from one valid state to another. This means
there can be intermediate states that are not consistent across multiple nodes.
Eventually consistent systems are useful in scenarios where absolute
consistency is not critical. For example, in the case of a Twitter status
update, if some users of the system do not see the latest status from a
particular user, it may not be very damaging for the system.
Eventually consistent systems cannot be used for use cases where
absolute/strict consistency is required. For example, a banking transaction
system cannot use eventual consistency, since it must reflect the consistent
state of a transaction at any point in time. Your account balance should not
show a different amount when accessed from different ATMs.
50. Explain what the GOD class
is, Why should you avoid the GOD class?
The most effective way to break applications is to create GOD classes: classes
that keep track of a lot of information and have several responsibilities. One
code change will most likely affect other parts of the class, and therefore
indirectly all the other classes that use it. That in turn leads to an even
bigger maintenance mess, since no one dares to make any changes other than
adding new functionality.
51. What is Unit test,
Integration Test, Smoke test, Regression Test and what are the differences
between them?
Unit test: Specify and test one point of the contract of a single method of a
class. This should have a very narrow and well-defined scope. Complex
dependencies and interactions with the outside world are stubbed or mocked.
Integration test: Test the correct interoperation of multiple subsystems. There
is a whole spectrum here, from testing the integration between two classes to
testing the integration with the production environment.
Smoke test (aka Sanity
check): A simple integration
test where we just check that when the system under test is invoked it returns
normally and does not blow up.
Smoke testing is an analogy with electronics, where the first test occurs when
powering up a circuit (if it smokes, it’s bad!), and, apparently, with
plumbing, where a system of pipes is literally filled with smoke and then
checked visually. If anything smokes, the system is leaky.
Regression test: A test that was written when a bug was
fixed. It ensures that this specific bug will not occur again. The full name is
“non-regression test”. It can also be a test made prior to changing an
application to make sure the application provides the same outcome.
To this, I will add:
Acceptance test: Test that a feature or use case is
correctly implemented. It is similar to an integration test, but with a focus
on the use case to provide rather than on the components involved.
System test: Tests a system as a black box.
Dependencies on other systems are often mocked or stubbed during the test
(otherwise it would be more of an integration test).
Pre-flight check: Tests that are repeated in a
production-like environment, to alleviate the ‘builds on my machine’ syndrome.
Often this is realized by doing an acceptance or smoke test in a production
like environment.
Canary test is an automated, non-destructive test
that is run on a regular basis in a LIVE environment, such that if it ever
fails, something really bad has happened. Examples might be:
· Has data that should only ever be available in DEV/TEST appeared in LIVE?
· Has a background process failed to run?
· Can a user log on?
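A minimal JUnit 5 sketch of the narrow unit-test scope described above (the class under test is hypothetical and inlined to keep the sketch self-contained):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

class CalculatorTest {
    // Hypothetical class under test.
    static class Calculator {
        int add(int a, int b) { return Math.addExact(a, b); } // throws on overflow
    }

    // Each test checks one point of the contract of a single method,
    // with no dependencies on the outside world.
    @Test
    void addReturnsSumOfOperands() {
        assertEquals(5, new Calculator().add(2, 3));
    }

    @Test
    void addFailsFastOnOverflow() {
        assertThrows(ArithmeticException.class,
                     () -> new Calculator().add(Integer.MAX_VALUE, 1));
    }
}
```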
52. Explain the difference between deadlock, livelock and starvation?
Source: https://www.baeldung.com/cs/deadlock-livelock-starvation
In a multiprogramming environment, more than one process may compete for a
finite set of resources. If a process requests a resource and the resource is
not presently available, the process waits for it. Sometimes this waiting
process never succeeds in getting access to the resource. This waiting for
resources leads to three scenarios: deadlock, livelock, and starvation.
Deadlock
A deadlock is a situation in which processes block each other due to resource
acquisition and none of the processes makes any progress, as each waits for the
resource held by the other. Consider a deadlock between process 1 and process
2: both processes hold one resource and wait for the resource held by the other
process. This is a deadlock situation, as neither process 1 nor process 2 can
make progress until one of them gives up its resource.
Conditions for Deadlock
To successfully characterize
a scenario as deadlock, the following four conditions must hold simultaneously:
Mutual Exclusion: At least one resource needs to be held
by a process in a non-sharable mode. Any other process requesting that resource
needs to wait.
Hold and Wait: A process must hold one resource and
requests additional resources that are currently held by other processes.
No Preemption: A resource can’t be forcefully released from a process. A
process can only release a resource voluntarily once it is done with it.
Circular Wait: A set of processes {p0, p1, p2, ..., pn} exists such that p0 is
waiting for a resource held by p1, p1 for a resource held by p2, and so on,
with pn waiting for a resource held by p0.
How to Prevent Deadlock
To prevent the occurrence of
deadlock, at least one of the necessary conditions discussed in the previous
section should not hold true. Let us examine the possibility of any of these
conditions being false:
Mutual Exclusion: In some cases, this condition can be
false. For example, in a read-only file system, one or more processes can be
granted sharable access. However, this condition can’t always be false. The
reason being some resources are intrinsically non-sharable. For instance, a
mutex lock is a non-sharable resource.
Hold and Wait: To ensure that the hold-and-wait
condition never occurs, we need to guarantee that once a process requests for a
resource it is not holding any other resource at that time. In general, a process
should request all resources before it begins its execution.
No Preemption: To make this condition false, a process
needs to make sure that it automatically releases all currently held resources
if the newly requested resource is not available.
Circular Wait: This condition can be made false by
imposing a total ordering of all resource types and ensure that each process
requests resources in increasing order of enumeration. Thus, if there is a set
of n resources {r1,r2,..rn}, a process requires resource r1 and r2 to complete
a task, it needs to request r1 first and then r2.
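A Java sketch of breaking the circular-wait condition by always acquiring locks in one global order (account ids serve as the ordering key; names are illustrative):

```java
public class LockOrdering {
    static class Account {
        final long id;
        long balance;
        Account(long id, long balance) { this.id = id; this.balance = balance; }
    }

    // Both threads transferring between the same pair of accounts acquire the
    // locks in the same (id) order, so a circular wait can never form.
    static void transfer(Account from, Account to, long cents) {
        Account first  = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= cents;
                to.balance   += cents;
            }
        }
    }
}
```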
Livelock
In the case of a livelock, the states of the processes involved constantly
change. Nevertheless, the processes still depend on each other and can never
finish their tasks.
Suppose both “process 1” and “process 2” need a common resource. Each process
checks whether the other process is in an active state; if so, it hands over
the resource to the other process. However, as both processes are active, they
keep handing the resource over to each other indefinitely.
A real-world example of livelock occurs when two people call each other at the
same time and both find the line busy. Both decide to hang up and attempt the
call again after the same time interval; in the next retry, they end up in the
same situation. This is an example of a livelock, as it can go on forever.
Difference Between Deadlock
and Livelock?
Although similar in nature, deadlocks and livelocks are not the same. In a
deadlock, the processes involved are stuck indefinitely and do not make any
state change. In a livelock scenario, the processes likewise block each other
and wait indefinitely, but they change their resource state continuously. The
notable point is that the resource state changes have no effect and do not help
the processes make any progress in their task.
Starvation
Starvation is the outcome of a process being unable to gain regular access to
the shared resources it requires to complete a task, and thus being unable to
make any progress. For example, “process 2” and “process 3” starve for the CPU
while “process 1” occupies it for a long duration.
What Causes Starvation?
Starvation is the outcome of a deadlock, a livelock, or continuous denial of a
resource to a process.
As we have seen, in the event of a deadlock or a livelock a process competes
with another process to acquire the desired resource to complete its task. Due
to the deadlock or livelock, however, it fails to acquire the resource and is
starved of it.
Further, it may occur that a process repeatedly gains access to a shared
resource or uses it for a long duration while other processes wait for the same
resource. Thus, the waiting processes are starved of the resource by the greedy
process.
Avoiding Starvation
One of the possible
solutions to prevent starvation is to use a resource scheduling algorithm with
a priority queue that also uses the aging technique. Aging is a technique that
periodically increases the priority of a waiting process. With this approach,
any process waiting for a resource for a longer duration eventually gains a
higher priority. And as the resource sharing is driven through the priority of
the process, no process starves for a resource indefinitely.
Another solution to prevent
starvation is to follow the round-robin pattern while allocating the resources
to a process. In this pattern, the resource is fairly allocated to each process
providing a chance to use the resource before it is allocated to another
process again.
53. Describe best practices for performance testing.
1. Understand your
application:
Prior to implementing the
application, it is essential to understand the application, its capabilities
and offerings, its intended use, and the conditions where it is expected to
thrive. Additionally, the team needs to develop an understanding of the probable
limitations of the app. Listing out the common factors that might impact the
performance can be an effective practice, followed by deploying these
parameters while testing.
2. Setting realistic
performance benchmarks
Businesses often end up
developing unrealistic expectations. Hence, it is essential to set realistic
baselines by selecting practical and realistic scenarios. Teams should ensure
that the testbeds include multiple varieties of devices and environments where
the app needs to thrive. For instance, several tests are executed right from a
zero value, followed by adding load until it reaches the desired threshold.
Nonetheless, this scenario is not realistic, and often engineers get a false
picture of the system load, as the load can never reduce to nil and then
progress further from that value.
3. Configuring the
environment
In the initial stages, after the test plans, a QA team should build a toolkit
of load-generation and performance-monitoring tools.
can be leveraged during sessions. As the project proceeds, it becomes a common
practice to modify, change or expand the server performance testing toolkit for
providing a broader view of the app performance.
4. Testing early and
regularly
Performance testing has often been an afterthought, performed in the later
stages of the development cycle.
However, to achieve satisfactory results from the app, performance tests must
be at the crux, executed in the initial stages, and in a proactive manner. The
earlier it is done, the better the team can identify and detect the bottlenecks
with enough time in hand to properly eliminate them. Further, it becomes more
complex and costly to implement modifications in the later stages of the
development cycle. Thus, the best practice would be to perform these tests as
part of the unit tests that will assist in quickly identifying performance
issues and rectifying the same. It is wise to incorporate an agile strategy
with agile methodologies trending today, employing iterative testing throughout
the development lifecycle. Besides, teams should allow performance unit testing
to be a part of the development process and later repeat similar tests on
broader scales across subsequent stages for evaluating the app's preparation
and maturity.
5. Understanding performance
from the point-of-view of the end-users
There is a common tendency
that performance tests focus more on the servers and clusters running software,
resulting in inadequate measurement of the human elements. Measuring the performance
of clustered servers might return a positive result, but users on a single
overloaded server might experience unsatisfactory results. Instead, it is a
better approach to also include the user's experience along with server
experiences and responses. The tests should systematically capture every user's
experience and interface timings with synchronicity to the metrics derived from
the server. Combining the user perspectives and including a Beta version of the
app can help capture the complete user experience seamlessly.
6. Performing System
Performance Tests
Applications are built on
many individual complex systems that include databases, app servers, web
services, legacy systems, and many more. While conducting app performance
testing, these systems should undergo rigorous performance testing individually
and together. This modular testing approach helps detect weak links, identify
the systems that can harm others, and determine which systems should be
isolated for further app performance optimization.
7. Building a complete
performance model
To measure the application's
performance, one needs to understand the system's capacity. This practice
involves planning what would be the steady-state concerning concurrent users,
simultaneous requests, average user sessions, and server utilization during the
peaks of the day. Furthermore, it is essential to define performance goals like
maximum response times, system scalability, good performance metrics, user
satisfaction marks, and the maximum capacity for the performance metrics.
It is also vital to define
related thresholds that will send alerts for potential performance issues as
the test passes through those thresholds. With increasing levels of risk,
additional thresholds need to be defined.
Building this complete
performance model and planning the processing should include:
· Key performance
indicators (KPIs), which include average latency, request and response times,
server utilization
· Business
process completion rate involving the transactions per second and system
throughput load profiles for average, peak, and spike tests
· Hardware
metrics that include CPU usage, memory usage, and network usage
8. Defining baselines for critical system functions
Most often, QA systems do not match the production systems. In such scenarios, having baseline performance measurements helps set reasonable goals for every environment used for testing. In particular, when no previous metrics exist, baselines provide an appropriate starting point for response time goals without having to borrow targets from other applications. Baseline measurements, such as single-user login time and the response time of individual screens, should preferably be taken with no load on the system.
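A minimal sketch of such a baseline measurement (not from the original text; the URL and credentials are hypothetical, and the requests library is assumed to be available):

    import time
    import requests  # third-party HTTP client, assumed to be installed

    # Time a single-user login with no other load on the system.
    start = time.perf_counter()
    requests.post("https://app.example.com/login",
                  data={"user": "baseline-tester", "password": "secret"})
    elapsed = time.perf_counter() - start
    print(f"single-user login baseline: {elapsed:.3f}s")

Repeating this a handful of times and recording the spread gives a defensible starting point for later response time goals.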
9. Consistent reporting and result analysis
Planning, designing, and executing performance tests are crucial but not enough; reporting must also be an area of focus. Efficient reporting conveys critical information and insights into the overall performance analysis and the outcomes of the app's activities, especially to project managers and developers. Consistent analysis and reporting feed the development of future fixes. Moreover, the reports for developers should be distinct from those provided to project managers, owners, and corporate executives.
10. Understand Performance Test Definitions
It’s crucial to have a common definition for the types of performance tests that should be executed against your applications, such as the following (a load-test sketch follows this list):
Single User Tests. Testing with one active user yields the best possible performance, and the response times can be used for baseline measurements.
Load Tests. Understand the behavior of the system under average load, including the expected number of concurrent users performing a specific number of transactions within an average hour.
Peak Load Tests. Understand system behavior under the heaviest anticipated usage, in terms of both concurrent users and transaction rates.
Endurance (Soak) Tests. Determine the longevity of components and whether the system can sustain average-to-peak load over a predefined duration. Monitor memory utilization to detect potential leaks.
Stress Tests. Understand the upper limits of capacity within the system by purposely pushing it to its breaking point.
High Availability Tests. Validate how the system behaves during a failure condition while under load. Many operational use cases should be included, such as seamless failover of network equipment or rolling server restarts.
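As a minimal sketch (this tool is not mentioned in the original text), a load test of this kind could be scripted with the open-source Locust framework; the endpoints, weights, and timings below are hypothetical:

    from locust import HttpUser, task, between

    class AverageLoadUser(HttpUser):
        # Simulated think time between user actions (hypothetical values).
        wait_time = between(1, 5)

        @task(3)
        def browse(self):
            self.client.get("/products")  # hypothetical endpoint

        @task(1)
        def checkout(self):
            self.client.post("/checkout", json={"cart_id": "demo"})

Running it with, e.g., locust -f loadtest.py --users 200 --spawn-rate 10 approximates an average-load profile; raising the user count turns the same script into a peak or stress test.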
Some additional practices for mobile applications
· Consider network quality, as latency tends to be higher on mobile networks and connections are unpredictable.
· Consider the entire mobile product family, since performance and available resources often vary within product families (like iOS, iPhone 12, and iPhone 12 mini) and even more across Android devices.
· Keep tests device-agnostic.
· Use emulation to a certain extent when it is not feasible to install the app on actual devices or to cover the demands of many physical devices.
· Deploy end-to-end tests, as mobile apps are only as good as their back-end server response time plus their own processing time.
· Expect higher user expectations and more concurrent users.
· Perform capacity testing, including low-memory and out-of-storage conditions.
55. Describe three categories of performance testing metrics.
You need to be able to make
sense of what has and hasn't been tested and report on that work to
stakeholders, who may not all be technologically inclined. Some people love
monstrous spreadsheets. Most, however, want summarized or visualized data that
answers a few key questions.
So, think about how to plan,
execute and report on your performance testing efforts in a meaningful way. Ask
yourself these three questions:
1. Can it go faster?
The efficiency of any
software application is key to its success. According to Neil Patel and
Kissmetrics: 'If an e-commerce site is making $100,000 per day, a one second
page delay could potentially cost you $2.5 million in lost sales every year.'
These performance testing metrics include:
· Average load times
· Response times
· Hits/connections per second
· Network bytes total per second
· Network output queue length
· Throughput for received requests
· Garbage collection
Remember: don't just focus on averages. An average is only useful in conjunction with the standard deviation across data points, or, even better, as a percentile. For example, 'average page load time is less than 0.1 seconds, 99 percent of the time.'
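To make the percentile point concrete (an illustration added here, with made-up numbers), the Python standard library can compute these figures directly from raw response-time samples:

    import statistics

    # Hypothetical response-time samples in seconds.
    samples = [0.08, 0.09, 0.07, 0.11, 0.08, 0.35, 0.09, 0.10, 0.08, 0.09]

    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    # quantiles(..., n=100) returns the 1st..99th percentile cut points.
    p99 = statistics.quantiles(samples, n=100)[98]

    print(f"mean={mean:.3f}s stdev={stdev:.3f}s p99={p99:.3f}s")

Note how the single 0.35-second outlier shows up far more dramatically in the 99th percentile than in the mean, which is exactly why percentiles are the more honest reporting format.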
2. Can it go farther?
How many resources does a
certain function use? Can the number of users scale up or down? Think of it as
if you were buying a kid new shoes - does the application have room to grow,
and at what point will you reach catastrophic failure (and the shoe seams
burst!)?
These performance testing metrics include:
· Top wait times for retrieving data from memory
· Bandwidth (bits per second)
· Memory/disk space/CPU usage
· Amount of latency/lag
· Concurrent users
· Private bytes
You're looking for
bottlenecks that slow or halt performance, usually caused by coding errors or
database design, although it's worth noting that hardware may contribute to
these as well. Running endurance tests reveals how these issues appear over
time, even if they don't turn up initially.
3. Can it go forever?
Consistency is key. A
software application should work the same way every time. You want to measure
how stable or reliable it is over time, especially in the face of spikes in
usage or other unexpected events.
These performance testing metrics include:
· Page faults (per second)
· Committed memory
· Maximum active sessions
· Thread counts
· Transactions passed or failed
· Error rate
All types of software
testing are, really, about finding breaking points. This is never more
important than if you experience rapid demand, such as an increase in
popularity. Will your software be able to cope with a surge of new users, such
as those seen by Netflix and TikTok in 2020? If not, you risk missing out on a
big opportunity.
Start going faster, farther,
forever... earlier?
Performance testing should
not be the last stage in development. By anticipating issues early on and
planning ahead, you save yourself the headache of fixing performance problems
uncovered during testing. That's why you'll want to engage a quality assurance
engineer at the planning phase of your next new build or feature.
'Only conducting performance
testing at the conclusion of system or functional testing is like conducting a
diagnostic blood test on a patient who is already dead.'
— Scott Barber, Performance
Architect
56. Behavioral software architect interview questions and answers
A. How do you stay up to date
with the latest developments in the software architect field?
Some of the best approaches
to stay up to date with the latest developments in the software architect field
include:
· Reading technical books
· Working on side projects
· Reading blogs
· Completing courses
B. What are your favorite
programming languages?
Each candidate may have a
different response to this question, or they may not have a clear favorite. But
it’s vital that your candidates can give rational and clear explanations for
their choices.
For instance, if they don’t
have a favorite language, they may explain that certain languages are better
for particular projects.
C. Have you ever failed when
completing a project? What did you learn from the failure?
Each of your candidates will
likely have experienced a time when they couldn’t complete a project. But they
should have learned from the failure. For example, a candidate may describe a
project they managed that was particularly big and complicated.
They may have had to coordinate between several teams, and although the project wasn’t as successful as hoped, they may have learned valuable techniques for handling complex coordination.
D. What is your approach to
task delegation?
It’s essential to get the
right balance between delegating all tasks and completing every task without
team support. Individual initiative is vital, but so is relying on your team.
The candidates to look for are those who clearly explain that it is important to keep an eye on the team and on the tasks that have been delegated.
E. Which features do you
dislike about your favorite programming language?
Candidates may respond in a
variety of ways to this question. But in general, the more limited their
response, the lower their level of expertise is likely to be.
For example, suppose a candidate’s main complaint is that Python delimits code blocks with whitespace. In that case, they may not fully understand the complexities of this programming language’s style and philosophy.
Source: https://www.maixuanviet.com/question/category/software-architecture
The 12 Factors
The following are the 12 factors that we are going to see in
detail with illustrations.
1. Codebase
2. Dependencies
3. Config
4. Backing Services
5. Build, Release, and Run
6. Processes
7. Port Binding
8. Concurrency
9. Disposability
10. Dev/Prod Parity
11. Logs
12. Admin Processes
1. Codebase
“One codebase tracked in revision control, many deploys”
Each microservice lives in a single repository of its own, tracked by a version-control system like Git (including GitHub, GitHub Enterprise, GitLab, Bitbucket, etc.). If there are 20 microservices in an application, then there must be 20 individual repositories.
One repository per service lets development teams develop the microservices independently and stay language-agnostic (i.e., services can be developed in different languages). This supports collaboration between teams and enables proper versioning of applications.
2. Dependencies
“Explicitly declare and isolate the dependencies”
Most applications require external dependencies to build and run. Rather than packaging the dependencies inside the microservice application, these dependencies must be declared explicitly and pulled in during the build process. This simplifies the setup for any new developer and provides consistency between development, staging, and production environments.
3. Config
“Store config in the environment”
Configuration refers to any value that can vary across deployments (e.g., developer machine, Dev, QA, production). This might include:
· URLs and credentials to connect to databases
· URLs and other information for supporting services such as logging and caching
· Credentials for third-party services such as Amazon AWS or payment providers
There must be a clean separation between configuration and code. If we ship the config with the code, it will fail in other environments, and there is also a risk of leaking credentials. Externalizing configuration is very important for a microservice application: it lets us deploy to multiple environments regardless of which runtime we are using.
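A minimal sketch of externalized configuration (the variable names here are hypothetical, not from the original text):

    import os

    # Deployment-specific values come from the environment, never from the code.
    DATABASE_URL = os.environ["DATABASE_URL"]        # set differently per environment
    CACHE_URL = os.environ.get("CACHE_URL", "")      # optional backing service
    PAYMENT_API_KEY = os.environ["PAYMENT_API_KEY"]  # secret, never committed

Because the code only reads names, the same build artifact can be promoted from Dev to QA to production with nothing but different environment values.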
4. Backing Services
“Treat backing services as attached resources”
Backing services are any services the microservice communicates with over the network during operation. Examples include databases (e.g., MySQL, PostgreSQL), caches, message brokers, etc.
The Backing Services principle encourages architects and developers to treat components such as databases, email servers, message brokers, and independent services as attached resources that can be provisioned and maintained separately.
A resource can be swapped at any given point in time without impacting the service. For example, say you are using a MySQL database in AWS and would like to change to Aurora: without making any code changes to your application, you can do it with just a config change.
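Continuing the sketch from the Config factor above (SQLAlchemy is used here purely as an illustration; the variable name is hypothetical):

    import os
    from sqlalchemy import create_engine

    # The database is an attached resource identified only by a URL.
    # Swapping MySQL for Aurora means changing DATABASE_URL, not this code.
    engine = create_engine(os.environ["DATABASE_URL"])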
5. Build, release, and Run
“Strictly separate build and run stages”
The deployment process for a microservice application is divided into three stages: Build, Release, and Run.
In the Build stage, the source code is retrieved from a source code management tool like Git, and the dependencies are gathered and bundled into a build artifact (e.g., a WAR or JAR file). The output of this stage is a package containing everything environment-agnostic that is required to run the application.
The Release stage takes the artifact from the build stage and applies configuration values (both environmental and app-specific) to it to produce a release. Each release is labeled with a unique ID, which makes it possible to roll back to a previous version of the deployment if needed.
The last stage is the Run stage, which usually occurs on the cloud provider, typically using tooling such as containers or Terraform/Ansible to launch the application. Finally, the application and its dependencies are deployed into the newly provisioned runtime environment, such as Kubernetes Pods, EC2 servers, or Lambda functions.
Build, Release, and Run are completely ephemeral: all artifacts and environments can be reconstructed from scratch (if anything goes wrong in the pipeline) using assets stored in the source code repository.
6. Processes
“Execute the app as one or more stateless processes”
Twelve-factor processes are stateless and share-nothing. Any
data that needs to persist must be stored in a stateful backing service,
typically a database.
The Processes principle states that a twelve-factor app should be stateless and should not store data in the application's memory. Sticky sessions should not be used either; the memory space or filesystem of the process may serve only as a brief, temporary cache.
For example, when downloading a large file and operating on it, the results of the operation should be stored in the database. Any data that must persist belongs in a stateful backing service, such as a database or a distributed cache, that every instance of the application can read.
When a process is stateless, instances can be scaled horizontally up and down, and statelessness prevents unintended side effects.
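A sketch of the stateless idea (Redis and the redis-py client are used here only as an example of a backing store; the host and key names are hypothetical):

    import redis  # assumes the redis-py client is installed

    # Session data lives in a backing service, not in process memory,
    # so any instance of the application can serve any request.
    store = redis.Redis(host="cache.internal", port=6379)

    def save_session(session_id, payload):
        store.set(f"session:{session_id}", payload, ex=3600)  # expire after 1 hour

    def load_session(session_id):
        value = store.get(f"session:{session_id}")
        return value.decode() if value else None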
7. Port Binding
“Export services via port binding”
The twelve-factor app is completely self-contained and does
not rely on the runtime injection of a webserver into the execution environment
to create a web-facing service. The web app exports HTTP as a service by
binding to a port, and listening to requests coming in on that port.
The Port Binding principle states that a service or application should be identifiable on the network by its port number rather than by a domain name. The reasoning is that domain names and IP addresses can change dynamically due to automated discovery mechanisms, whereas a designated port number keeps the microservice reliably identifiable.
For example, port 80 is conventional for web servers using HTTP, port 443 is the default for HTTPS, port 22 for SSH, port 3306 the default for MySQL, and port 27017 the default for MongoDB. Likewise, each of our microservices must have its own designated port number and must run on that port.
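A self-contained service of this kind can be sketched with nothing but the standard library (port 8080 below is an arbitrary choice):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    # The app exports HTTP as a service by binding to a port itself;
    # no web server is injected into the execution environment at runtime.
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")

    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()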
8. Concurrency
“Scale out via the process model”
The Concurrency principle concerns scaling the application. Twelve-factor principles suggest running the application as multiple processes or instances (horizontal scaling) instead of growing one large system (vertical scaling). By adopting containerization, applications can be scaled horizontally on demand.
An application is exposed to the network via web servers that operate behind a load balancer; the web servers in turn communicate with a business service that runs behind another load balancer. Under load, the business layer can then be scaled up independently. In architectures that do not support this kind of concurrency, such as a monolith, the entire application must be scaled up instead.
9. Disposability
“Maximize robustness with fast startup and graceful shutdown”
An instance starting up or shutting down should not impact the application state. Applications should minimize startup time; a process should take only a few seconds to launch and be ready to receive requests or jobs. Applications can crash for various reasons, and the system should ensure that the impact is minimal and that the application is always left in a valid state.
By adopting containerization into the deployment process of
microservices, we can make the application disposable. Docker containers can be
started or stopped instantly. Storing request, state, or session data in queues
or other backing services ensures that a request is handled seamlessly in the
event of a container crash.
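Graceful shutdown can be sketched in a few lines (the cleanup steps in the comment are placeholders for whatever the real process needs to do):

    import signal
    import sys

    # Containers (Docker, Kubernetes) send SIGTERM before killing a process;
    # handling it lets the instance finish up and exit cleanly and quickly.
    def handle_sigterm(signum, frame):
        # e.g. return in-flight jobs to the queue, close connections, flush logs
        sys.exit(0)

    signal.signal(signal.SIGTERM, handle_sigterm)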
10. Dev/Prod parity
“Keep development, staging, and production as similar as possible”
The dev/prod parity principle focuses on the importance of keeping development, staging, and production environments as similar as possible. This will eliminate unexpected issues when the code is deployed to production.
Tools like Docker are very helpful in implementing this dev/test/prod parity. The benefit of a container is that it provides a uniform environment for running code by packaging the code and the dependencies required to run it into a Docker image. Containers enable the creation and use of the same image in development, staging, and production, and they also help ensure that the same backing services are used in every environment.
11. Logs
“Treat logs as event streams”
Logs are the stream of aggregated, time-ordered events
collected from the output streams of all running processes and backing
services. Logs in their raw form are typically a text format with one event per
line. Logs have no fixed beginning or end but flow continuously as long as the
app is operating.
The Logs factor highlights that your application should not concern itself with the routing, storage, or analysis of its output stream. In cloud-native applications, the aggregation, processing, and storage of these logs are the responsibility of the cloud provider or of other tool suites (e.g., the ELK stack, Splunk, Sumo Logic) running alongside the cloud platform. This factor improves flexibility for introspecting behavior over time and enables real-time metrics to be collected and analyzed effectively.
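In code, this usually amounts to nothing more than writing one event per line to stdout and leaving routing to the platform (a minimal sketch; the log message is invented):

    import logging
    import sys

    # Write events to stdout; the platform or a tool suite
    # (ELK, Splunk, etc.) takes care of aggregation and storage.
    logging.basicConfig(stream=sys.stdout, level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(message)s")
    logging.info("order %s accepted", "demo-123")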
12. Admin processes
“Run admin/management tasks as one-off processes”
Application deployment involves a number of one-off processes, such as data migrations or one-off scripts executed in a specific environment. Twelve-factor principles advocate keeping such administrative tasks as part of the application codebase in the repository. By doing so, one-off scripts follow the same process defined for your codebase and run in an environment identical to that of the app's regular long-running processes.
They run against a release, using the same codebase and config
as any process run against that release. Admin code must ship with application
code to avoid synchronization issues.
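A sketch of such a one-off task shipping with the codebase (the module path and config name are hypothetical):

    # scripts/migrate.py -- a one-off admin task that lives in the same
    # repository and reuses the same config mechanism as the application.
    import os

    def main():
        db_url = os.environ["DATABASE_URL"]  # identical config to the app
        print(f"running one-off migration against {db_url}")
        # ... perform the migration here ...

    if __name__ == "__main__":
        main()

Running it inside the same release (for example, in the same container image as the app) guarantees that the codebase and config match the long-running processes.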