As systems evolve, the need to migrate databases arises, depending on the data we are dealing with. In one of our projects, we had a scenario where non-transactional data had to be accessed frequently with low latency across regions. Azure Cosmos DB is Microsoft's fast NoSQL database and the first globally distributed database service in the market today to offer comprehensive service level agreements encompassing throughput, latency, availability, and consistency.

So, the choice was clear for us: move the data from PostgreSQL to Cosmos DB. In PostgreSQL the data format is flat (i.e., in the form of tables and columns), while Cosmos DB supports flexible schemas and hierarchical data, which makes it well suited for storing catalog data. The JSON format supported by Cosmos DB is effective and very lightweight; a small sketch of how a flat row maps to a JSON document follows.
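To make the flat-versus-hierarchical distinction concrete, here is a minimal Python sketch; the catalog fields are hypothetical and only meant to illustrate how a flat relational row can be reshaped into the kind of JSON document Cosmos DB stores.

import json

# A flat row as it might come out of a PostgreSQL table (hypothetical columns).
row = {
    "product_id": 101,
    "product_name": "Collar",
    "category": "Accessories",
    "price_usd": 9.99,
    "tag_1": "small",
    "tag_2": "nylon",
}

# The same record reshaped as a hierarchical JSON document for Cosmos DB.
document = {
    "id": str(row["product_id"]),
    "name": row["product_name"],
    "category": row["category"],
    "price": {"amount": row["price_usd"], "currency": "USD"},
    "tags": [row["tag_1"], row["tag_2"]],
}

print(json.dumps(document, indent=2))

Columns that would need extra tables or repeated columns in PostgreSQL (such as the tags) collapse naturally into arrays and nested objects in the document model.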

For the data migration from PostgreSQL to Cosmos DB we chose Azure Data Factory. Azure Data Factory is an easy-to-use, cost-effective, fully serverless cloud service that helps you integrate, transform, and visualize all your data, and it accelerates data transformation with code-free data flows.

Data Transformation using Azure Data Factory

Azure Data Factory is an orchestration tool used to move and transform data from one source to another. It can process and transform raw data into predictions and insights, and it lets you perform data transformation activities via pipelines. Data flows are created in debug mode to validate the transformation logic, the data flow activity is added to a pipeline to execute and test the data flow, and "Trigger now" is used to test the data flow that is in the pipeline.

Several options are available for converting JSON data to flat data; converting flat data into JSON, however, is still a challenge. Inserting null values into JSON within a data flow is quite tedious, and a dedicated pipeline must be created to handle null values.

Azure Data Factory has three main options:

  • Author
  • Monitor
  • Manage

The Author option provides the main environment for development. Using this option, we can design and manage Azure Data Factory resources such as the pipelines and dataflows.

The Monitor option allows you to monitor pipeline and trigger runs, sessions, and the time taken to execute the pipelines; check whether a pipeline run succeeded or failed; and set up alerts.

The Manage option allows you to manage connections, linked services, source control, triggers, parameters, and security.

Now let’s look at how to migrate from PostgreSQL to CosmosDB and transform the data.

Pipeline and activities creation

The Author option provides the environment for developing pipelines and data flows. So, the first step is to create a pipeline that contains the data flow activity. A pipeline is a logical grouping of activities that perform a task. By clicking Orchestrate on the home page, we can create and name the pipeline.

The Activities pane lets you add various activities such as Move and Transform, Data Flow (we can use existing data flows or even create new ones), Azure Data Explorer, Azure Functions, Batch Service, Databricks, Data Lake Analytics, and more. Once the data flow is created, we can provide the transformation logic in the data flow canvas. Data flows distribute the processing of data over different nodes in a Spark cluster so the operations run in parallel. We also need to create a mapping data flow to perform the transformation.

We can choose the format of the data as per the requirement. We can also select the dataset, which is simply a view of, or reference to, the data you want to use in your activity.

In our data migration scenario, take an activity that copies data from the source (the PostgreSQL data directory): the data is read from the source and placed in blob storage. The basic details are stored by the first copy activity, the patient color details by the second, microchip details by the third, breed details by the fourth, and existing patient details by the fifth. The data flow then performs transformations on the data such as join, conditional split, aggregate, pivot, flatten, union, and split (a sketch of this kind of logic follows below). Finally, triggers are scheduled to run the transformations in the pipelines.
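Because mapping data flows execute on Spark, the same transformation logic can be pictured as PySpark. The sketch below is purely illustrative and uses hypothetical patient and breed columns, not the actual pipeline's schema; it joins two flat inputs and nests the result the way a Cosmos DB document would expect.

from pyspark.sql import SparkSession
from pyspark.sql.functions import struct

spark = SparkSession.builder.getOrCreate()

# Hypothetical flat inputs, standing in for the tables staged in blob storage.
patients = spark.createDataFrame(
    [(1, "Bella", "Brown"), (2, "Max", "Black")],
    ["patient_id", "name", "color"],
)
breeds = spark.createDataFrame(
    [(1, "Beagle"), (2, "Labrador")],
    ["patient_id", "breed"],
)

# Join the flat tables and nest the detail columns under each patient,
# mirroring the join and flatten/derive steps of a mapping data flow.
documents = (
    patients.join(breeds, "patient_id")
    .select("patient_id", "name", struct("color", "breed").alias("details"))
)

# Each row can now be written out as a JSON document suitable for Cosmos DB.
documents.write.mode("overwrite").json("/tmp/patient-documents")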

Linked Service and Integration Runtime

When creating a dataset, we must first create a linked service to link the data store to Data Factory via the management hub. Linked services are much like connection strings and define the connection to the data source.

In our data migration scenario, we have created the following linked services: Azure Managed Instance, Azure Blob storage, Cosmos DB (one for dev and one for UI), and PostgreSQL. To create a linked service, we must provide the server name, port, database name, username, password, etc. These are the inputs the data flow uses to connect to the specific database. In this migration, the PostgreSQL data is migrated from a separate VM environment into the Azure environment.

Here, the base directories are called containers; inside the containers we have several directories, which in turn contain the storage files.

The integration runtime via the management hub is the compute infrastructure for providing data integration capabilities such as data flow, data movement, activity dispatch, and SSIS package execution across different network environments. It provides the linkage between the activity and linked services. The default option is public for the Auto-resolve integration runtime. We can also create private customized ones based on the requirement for dataflow execution.

We hope this walkthrough of migrating data from PostgreSQL to Azure CosmosDB using Azure Data Factory was helpful. If you have queries or want us to help you with your data migration projects, feel free to reach out to us.

Angular is a component-based application design framework for building scalable, reactive, and efficient single-page apps. It provides a wide range of tools and well-integrated libraries to build, develop, and test your applications. Angular applications are written by composing HTML templates, and components are created to manage these templates. The application logic lives in services, and these services and components are wrapped into modules.

Angular Evolution

The Angular framework was first introduced a decade ago, in 2010, under the name AngularJS. Over the course of the years, the framework has developed through various updates. AngularJS 1.x is a JavaScript-based framework for creating rich web and mobile applications. The Model-View-Controller (MVC) pattern and its variations, Model-View-Presenter (MVP) and Model-View-ViewModel (MVVM), have formed the AngularJS architecture. Even though AngularJS builds on proven architectural patterns and was the standard solution for an application with a view, it has certain disadvantages:

  • Memory leakage.
  • Internet Explorer 8.0 doesn’t support Angular JS.
  • AngularJS totally depends on JavaScript.

Angular 2+ is a TypeScript-based, free, open-source technology used to develop web and mobile applications. It is a component-based framework and has a collection of well-integrated libraries. Angular 2+ is simple and pretty straightforward to use. Because it is written in TypeScript, code can be validated with ease and errors are shown at the time of typing. Implementing forms and validation is much simpler and more effective with Angular 2+. With recent advancements, Angular's responsive design has shifted towards a mobile-first approach.

Angular 3 was skipped because the framework was developed in a monorepo (a single repository) and its router package was already at version 3; to avoid confusion in terms of dependencies, the third version was skipped. Angular 4 was released in 2017 and was compatible with both TypeScript 2.1 and 2.2; this version improved speed and performance by a good extent. Angular 5 was released around the end of 2017 and comprised many new features and improvements: it provided a build optimizer that removed unnecessary code from applications, an improved compiler, improvements in Angular Universal for code allocation, and, most important of all, support for TypeScript 2.4. Angular 6 was released in early 2018 and introduced Angular Elements and a new Angular rendering engine. Angular 7 was released later that year with major performance improvements; it also provided a drag-and-drop module, Angular Material, and the Component Dev Kit. Angular 8 was released in 2019 with updated dependencies, improved web worker bundling, a new lazy-loading syntax, and Angular Firebase support.

Angular 9 was introduced in 2020; it provided new features and updates such as a more consistent ng CLI, an updated and improved API extractor, dependency injection updates, better speed and performance, AOT builds for a faster and better-performing compiler, and support for TypeScript 3.7. Angular 10 was released in June 2020 with new updates and features such as a new date range picker, an updated compiler, optional stricter settings to catch bugs ahead of time, performance improvements, and support for TypeScript 3.9.

Angular 11 was released in November 2020; its major improvements are router performance, automatic inlining of fonts, updated Hot Module Replacement (HMR), faster builds and improved performance, support for TypeScript 4 and Webpack 5, and a component test harness with parallel functions and improved performance. Angular 12 was released in May 2021 with major improvements in styling, support for TypeScript 4.2 and Webpack 5.37, nullish coalescing for writing better and cleaner TypeScript classes, an improved ng CLI, changes around Protractor, new dev tools, and migration from legacy i18n message IDs. Angular 13 is also available with impactful changes towards optimization: it no longer has the View Engine, and with reduced dependency on ngcc we can hope for faster and improved compilation. Angular 13 supports TypeScript 4.4, provides an improved and modernized Angular Package Format and an improved Angular CLI, and no longer supports IE11.

Why Angular?

With its evolution over the years, Angular has become one of the most recommended frameworks for businesses and enterprises for various reasons. Angular is used to develop single-page client applications using HTML and TypeScript. Angular offers two-way data binding, sharing data between the model and the view; hence, when data is modified or changed, components get updated automatically in real time. Code reusability in an Angular application is high.

Angular is a great choice for businesses as it provides a framework that works well with back-end languages and combines business logic and UI very well. Angular provides an effective cross-platform development framework that makes the development process easier at reduced cost. Though it is initially a little complex to learn, it is worth it, as it produces high-quality applications. TypeScript helps developers write clean and neat code, which makes fixing bugs that much easier. The framework is structured around component-based development, which aids a steady development process with consistent and high reusability of components. This further helps provide better maintainability and productivity.

The evolution of Angular has brought various new features and improvements that have significantly increased speed and optimized performance. The older versions had larger bundle sizes, which hindered fast loading of applications, but with recent improvements such as lazy-loaded modules and the Ivy renderer, we can create lightweight web applications that are faster and better. To overcome productivity issues and provide a faster development process, features such as dependency injection and Angular services are also provided. Angular keeps evolving based on requests from Google and the Angular community, making it one of the ideal frameworks for businesses and enterprises.

Angular Architecture and its core components

The Angular architecture contains the following core components:

  • Module
  • Meta Data
  • Directives
  • Pipe
  • Service
  • Decorators

Module

Module in Angular refers to a place where you group the components, directives, pipes, and services, which are related to the application. Every Angular app has a root module that has the bootstrap mechanism to launch the app.

Meta Data

Metadata is used to decorate a class so that it can configure the expected behavior of the class. Metadata is attached to a TypeScript class using decorators.

Directives

Directives are classes that are used to change the behavior or view of DOM elements in Angular applications. 

Three types of Angular directives are as follows:

  1. Components – directives with a template.
  2. Attribute directives – directives that change the appearance or behaviour of an element, component, or another directive.
  3. Structural directives – directives that change the DOM layout by adding and removing DOM elements.

Pipe

A pipe is a simple way to transform values in an Angular template. Some of the built-in pipes in Angular are CurrencyPipe, DatePipe, JsonPipe, LowerCasePipe, UpperCasePipe, PercentPipe, and SlicePipe.

Service

Services are used to access and share methods and properties with other components across the entire project, which reduces the repetition of functions and properties. HTTP requests and responses are handled in services.

There are two types of services in angular.

  • Built-in services – There are approximately 30 built-in services in angular.
  • Custom services – In angular, if the user wants to create their own service, they can opt for custom services.

Decorators

Angular decorators are used to store metadata about a class, method, property, or parameter. There are four types of decorators in Angular:

  1. Class Decorators
  2. Property Decorators
  3. Method Decorators
  4. Parameter Decorators

1. Class Decorators

Class decorators are top-level decorators that define the purpose of the class.

2. Property Decorators

Property decorators are used to decorate specific properties within the class. @Input() is an example of a property decorator.

3. Method Decorators

Method decorators are used to decorate methods within your class with functionality. @HostListener() is an example of a method decorator.

4. Parameter Decorators

Parameter decorators are used to decorate parameters in class constructors. @Inject() is an example of a parameter decorator.

Testing 

Angular uses the Jasmine testing framework, which provides multiple functionalities for testing. Karma is the task runner; it uses a configuration file to set up the start-up, reporters, and testing framework.

Limitations of Angular

Though Angular is one of the popular modern-day frameworks, it also has a few limitations:

  • Steep learning curve

Though Angular is a great framework, it can be quite difficult even for people with experience in HTML, JS, and CSS, and for people who are not used to n-tier architecture. They can find it hard to learn some of the concepts, as Angular has its own set of rules, which can be difficult and uncomfortable for novice learners.

  • Limited SEO options

Though Angular is a great and powerful platform to build single-page applications, limited SEO options and poor accessibility to search engine crawlers are some of the major drawbacks. This makes it difficult to place the website correctly in the list provided by the search engines.

  • Complex and Verbose

Angular has a complex framework of modules and extensive capabilities for integration and customization. Though Angular provides an array of online tutorials and documentation, it is quite uncomfortable to learn at the beginning. A slow and steady pace is recommended to learn the platform and language.

  • Complex directives

Angular has three kinds of directives: attribute directives, structural directives, and component directives. Each has its own limitations, and it is quite complex for beginners to understand when to use which.

Angular Prerequisites

Angular requires the following prerequisites: Node.js, the Angular CLI, and a text editor.

Supported by Google

Google offers Long-Term Support (LTS) for Angular, which helps it scale up to enterprise Angular application development. Netflix, Gmail, YouTube TV, Upwork, and other organizations also use the Angular framework for application development.

Open-Replay

OpenReplay is an open-source tool that developers can integrate with their Angular applications. As users interact with the application, OpenReplay tracks their sessions, so developers can easily see how user-friendly the application is. It has network tracker, Redux, NgRx action tracker, and profiler plugins, and it also captures the performance of the application.

With this article we wanted to give you an introduction to Angular: the evolution of Angular over the decade, the need for Angular in today's business world, and its core architecture. We covered the core components of the architecture and also the limitations of the framework. The bottom line is that Angular has become one of the best frameworks for developing single-page apps.

Azure App Service is a Platform as a Service (PaaS) that is used to build, deploy, and scale enterprise-grade applications such as web apps, mobile apps, logic apps, API apps, and function apps. It supports multiple programming languages and frameworks such as .NET, .NET Core, Java, Ruby, Node.js, PHP, and Python.

From a developer's perspective, Azure App Service provides a great platform to develop, deploy, and scale applications. However, when it comes to production environments, Infrastructure as Code (IaC) comes in handy. Terraform is an open-source IaC tool with a consistent CLI that lets you write infrastructure as code using declarative configuration files and also manage, plan, and apply changes to infrastructure versions to reach the required configuration state.

Terraform is a good choice as it reduces manual human error by codifying the application infrastructure. Terraform manages infrastructure across more than 300 public clouds and services, and it provides a reusable, cost-effective, and consistent environment that handles dependencies and version control. In this article, we will take you through the process of deploying a web app in Azure App Service using Terraform.

To deploy the web app in Azure App Service using Terraform, here are the steps we need to follow:

  • Create the Resource Group
  • Create App Service plan and deploy web app

Create the Resource Group:

The first step is to create a resource group using the following Terraform code. Any resource that is created must be created within a resource group.

terraform {    
  required_providers {    
    azurerm = {    
      source = "hashicorp/azurerm"    
    }    
  }    
} 
   
provider "azurerm" {    
  features {}    
}

resource "azurerm_resource_group" "resource_group" {
  name     = "app-service-rg"
  location = "East US"
}

Create App Service plan and deploy web app

The App Service plan defines the capacity and resources to be shared among one or more app services that are assigned to that plan. Azure WebApp must be associated with an App Service Plan as it specifies the computing resources that are required for the web app to function. The following code creates an app service plan.

resource "azurerm_app_service_plan" "app_service_plan" {
  name                = "example-appserviceplan"
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name

  sku {
    tier = "Standard"
    size = "S1"
  }
}

Next, add the code for creating the App Service. The final Terraform file looks like the one below.

terraform {    
  required_providers {    
    azurerm = {    
      source = "hashicorp/azurerm"    
    }    
  }    
} 
   
provider "azurerm" {    
  features {}    
}

resource "azurerm_resource_group" "resource_group" {
  name     = "app-service-rg"
  location = "East US"
}

resource "azurerm_app_service_plan" "app_service_plan" {
  name                = "myappservice-plan"
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name

  sku {
    tier = "Standard"
    size = "S1"
  }
}

resource "azurerm_app_service" "app_service" {
  name                = "mywebapp-453627 "
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name
  app_service_plan_id = azurerm_app_service_plan.app_service_plan.id

  #(Optional)
  site_config {
dotnet_framework_version = "v4.0"
    scm_type                 = "LocalGit"
  }
  
  #(Optional)
  app_settings = {
    "SOME_KEY" = "some-value"
  }

}

Now, we should run the following command to initialize Terraform.

Command: terraform init

To create an execution plan, we should run the terraform plan command:

Command: terraform plan -out appservice.tfplan

To apply the plan, run the following command:

Command: terraform apply "appservice.tfplan"

We can verify that the app service was created in the specified App Service plan and resource group by checking the Azure portal.

Hope you found this article useful. Stay tuned for more articles coming up on Azure App Service and Terraform.

With the average cost of a data breach at $3.86 million last year, it is wise to employ a good backup system. More than 80% of Fortune 500 companies use Microsoft Azure to run their businesses effortlessly, as it is simple, ever-evolving, secure, and cost-effective. So, in this article, let's explore how to back up and restore Azure Managed Disks using an Azure Backup vault.

Azure Backup services allow you to back up your data and recover it from the Microsoft Azure cloud. They back up and store the data in backup vaults. These backup vaults make certain that backups are successful by monitoring and tracking the storage containers, they optimize resources by automating maintenance tasks, and they also provide better security and access control for storing and recovering data.

Data sources that are supported by Azure Backup include

  • Azure Database for PostgreSQL servers,
  • Azure Blobs, and
  • Azure Disks.

Prerequisites for performing disk backup and restore operations

Backup Vault’s managed identity needs the below roles to be assigned to it for performing disk backup and restore operations:

  • Disk Backup Reader role on the Source disk that needs to be backed up.
  • Disk Snapshot Contributor role on the Resource group where backups are created and managed by Azure Backup.
  • Disk Restore Operator role on the Resource group where the disk will be restored by the Azure Backup.

To assign Azure roles, the user must have Microsoft.Authorization/roleAssignments/write permissions, such as User Access Administrator or Owner.

Steps to backup managed disks:

Create a Backup vault

  1. Go to Backup center service in Azure portal. Backup center enables enterprises to govern, monitor, operate, and analyze backups at scale. Jobs performed in last 24 hours are displayed in the Overview tab. Operations such as Scheduled backup, On-demand backup and Restore are listed along with the status of each operation (Failed, In progress or Completed)

2. Select Vault from the Overview tab

3. In Start: Create Vault page, select Backup vault and then Continue

4. In Basics tab,

  • Under PROJECT DETAILS, select the Subscription and Resource group of the vault to be created.
  • Under INSTANCE DETAILS, type in the Backup vault name.
  • Select the region of the backup vault and backup storage redundancy
  • Select Review and create

5. Select Create. The Backup vault will be created.

Create a backup policy

  1. Select Policy from Backup center’s Overview tab

2. In Start: Create Policy page, select Datasource type as Azure Disks and Vault type is prepopulated as Backup vault. Then, select Continue.

3. In the Basics tab, type in the policy name to be created. Select Datasource type as Azure Disk and select Vault as the Backup vault that was just created. Click on Next: Schedule and Retention to go to next tab.

4. In the Schedule and retention tab,

  • Under Backup schedule, select the backup schedule frequency and specify the time when backup must happen.
  • Specify the number of days backup should be retained under Retention settings.

5. After validation, in Review and Create tab, select Create. The Backup policy is created.

Configure backup of an Azure Disk

To back up an Azure disk:

  1. Assign Disk Backup Reader role on the Source disk that needs to be backed up to the Backup vault’s managed identity.
  2. Assign Disk Snapshot Contributor role on the Snapshot Resource group to the Backup vault’s managed identity.

a)     Steps to Assign Disk Backup Reader role on the Source Disk

  1. Go to the source disk that we need to configure backup for.
  2. Select access control (IAM) and Add role assignment

3. In Role tab, search for the role Disk Backup Reader and select it

4. In Members tab, select assign access to Managed identity and select members as the Backup vault.

5. Select Review+ assign. Assignment of Disk Backup Reader role to the Backup vault is done.

b)     Steps to Assign Disk Snapshot Contributor role on the Snapshot Resource group

  1. Go to the target Snapshot resource group.
  2. Select access control (IAM) and Add role assignment

3. In Role tab, search for the role Disk Snapshot Contributor and select it

4. In Members tab, select assign access to Managed identity and select members as the Backup vault.

5. Select Review+ assign. Assignment of Disk Snapshot Contributor role to the Backup vault is done.

Steps to Backup an Azure Disk

  1. In Backup center, Select Backup from the Overview tab

2. In Start: Configure Backup, select Datasource type as Azure Disks and Vault type is prepopulated as Backup vault.

3. In the Basics tab, select Datasource type as Azure Disks, select the Vault created, and then select Next.

4. In Backup policy tab, Select the backup policy.

5. In Datasources tab:

a) Click on Add/Edit and select the disks to backup.

b) Click Select after selection of disks.

c) Select the Snapshot Resource Group, the resource group where snapshots of disks are stored. Once the disk backup is configured, the Snapshot Resource Group that's assigned to a backup instance cannot be changed.

d) Select Validate. Click Next.

6. In Review and Configure tab, Select Configure Backup. The configuration of backup for the disk is done.

For the validation to be successful, we must assign the Disk Backup Reader role on the source disk that needs to be backed up to the Backup vault's managed identity, and the Disk Snapshot Contributor role on the Snapshot Resource group to the Backup vault's managed identity.

On demand backup of an Azure Disk

  1. In the Backup Vault, go to Backup Instances and select the disk to perform on demand backup

2. Select Backup Now.

3. In Backup vault, go to Backup jobs to view the status of the backup.

Restore an Azure Disk from backup

To restore an Azure disk, we need to assign the Disk Restore Operator role to the Backup vault's managed identity on the resource group where the disk will be restored by Azure Backup.

Steps to Assign Disk Restore Operator role on the Target Resource group

  1. Go to the target resource group.
  2. Select access control (IAM) and Add role assignment

3. In Role tab, search for the role Disk Restore Operator and select it

4. In Members tab, select assign access to Managed identity and select members as the Backup vault.

5. Select Review+ assign. Assignment of Disk Restore Operator role to the Backup vault is done.

Steps to Restore an Azure Disk

  1. Go to Backup center -> select Backup vault -> select Restore

2. In Basics tab, Select the Backup instance as the disk that needs to be restored and then Next.

3. In Select Restore Point tab, select the required or latest restore point.

4. In Restore parameters, select Target subscription and Target resource group. Type in the Restored disk name and select Next: Review and restore.

5. After validation, click on Restore. The restore operation is started.

Note: If validation is unsuccessful, follow the steps to Assign Disk Restore Operator role on the Target Resource group.

6. Restore operation is now completed.

While adopting DevOps practices automates and optimizes processes through technology, it all starts with the culture inside the organization—and the people who play a part in it.

Check out this infographic to learn how DevOps unifies people, process, and technology to bring better products to customers faster. Then imagine how the power of GitHub and Azure can benefit your DevOps team.

Together, Microsoft GitHub and Azure DevOps provide an end-to-end experience for development teams to easily collaborate while building and releasing code to Azure, on-premises, or any cloud. Contact us today to learn more.

This infographic offers an in-depth look at how Microsoft business analytics and AI is intelligent, trusted, and flexible. This service produces faster, more accurate insights and predictions. It also offers the most secure, compliant, and scalable system. Finally, it works with what you have.

Would you like to leverage Microsoft Business Analytics and AI for faster, more accurate insights and predictions? At CloudIQ Technologies, we have a knowledgeable and professional team ready to help you. Contact us today to learn more.

Seattle [23 Jun 2021] – CloudIQ Technologies Inc today announced it has earned the Kubernetes on Microsoft Azure advanced specialization, a validation of a solution partner’s deep knowledge, extensive experience and proven expertise in deploying and managing production workloads in the cloud using containers and managing hosted Kubernetes environments in Microsoft Azure.

Only partners that meet stringent criteria around customer success and staff skilling, as well as pass a third-party audit of their container-based workload deployment and management practices, are able to earn the Kubernetes on Azure advanced specialization.

With over 75% of global organizations expected to run containerized applications in production by 2022, many are looking for a partner with advanced skills to migrate their existing containerized workloads to the cloud, or assist them in developing cloud-native applications using container technologies, DevOps patterns, and a microservices approach.

“With our deep expertise in cloud-native architecture design, we help clients build and run scalable applications with improved security, faster release cycles, easier management, and lower costs”, said Mr. Prem Kandalu, CEO. “As a partner who has earned the Kubernetes on Microsoft Azure advanced specialization, CloudIQ will pass on the benefits of our continued collaboration with Microsoft to our clients.”

Rodney Clark, Corporate Vice President, Global Partner Solutions, Channel Sales and Channel Chief at Microsoft added, “The Kubernetes on Microsoft Azure advanced specialization highlights the partners who can be viewed as most capable when it comes to deploying and managing containerized applications in Azure. CloudIQ Technologies clearly demonstrated that they have both the skills and the experience to deliver best-in-class cloud-native capabilities to customers with Azure.”

About CloudIQ Technologies

CloudIQ is a leading cloud consulting and solutions firm that helps businesses envision new products, create innovative business models, and deliver the next level of customer experiences by leveraging the cloud. From standalone cloud projects to enterprise-wide cloud architecture design you can rely on our cloud engineering expertise.

As a Microsoft Gold Partner (Cloud Platform), earner of the Modernization of Web Applications on Microsoft Azure and Kubernetes on Microsoft Azure advanced specializations, Kubernetes Certified Service Provider (KCSP) and Kubernetes Training Partner (KTP), we serve as trusted advisors to Fortune 500 organizations and leverage our deep industry expertise in building cloud-native solutions to help our clients realize the cost, scale and security benefits of the cloud.

Without artificial intelligence (AI), organizing and extracting insights from vast amounts of enterprise data would be a nearly impossible task. Choosing the right AI capabilities is essential to successful initiatives. This infographic presents the four guiding principles behind Microsoft #Azure #AI and why it remains the top choice for today’s leading enterprises.

Would you like to leverage Azure AI for your business? At CloudIQ Technologies, we have a knowledgeable and professional team ready to help you transform massive amounts of raw information into meaningful insights for your business. Contact us today to learn more.

Have you been looking for a fully managed, secure platform for your web apps? Azure App Service is built to help you build, deploy, and scale your web apps and APIs on your terms. Work with .NET, .NET Core, Node.js, Java, Python, or PHP, in containers or running on Windows or Linux. Check out this infographic and contact CloudIQ Technologies to learn more.

Would you like to modernize your apps using Azure App Service? At CloudIQ Technologies, we have a knowledgeable and professional team ready to help you. Contact us today to learn more.

Azure SQL Database is intelligent and always up to date. It is the only cloud database with evergreen SQL that never needs to be patched or updated. This infographic presents the benefits of Azure SQL Database and Azure Advanced Threat Protection.

Would you like to migrate SQL Server databases to the Azure cloud? At CloudIQ Technologies, we have a knowledgeable and professional team ready to help address any of your IT infrastructure upgrade needs. Contact us today to learn more.

Moving Windows Server and SQL workloads to Azure provides flexible, scalable, and highly available cloud infrastructure. It also supports rapid innovation and digital transformation, freeing you to focus on your mission. This infographic presents the benefits of running Windows Server and SQL Server on Azure. 

Would you like to move your Windows Server and SQL workloads to Azure? At CloudIQ Technologies, we have a knowledgeable and professional team ready to help address any of your IT infrastructure upgrade needs. Contact us today to learn more.

There are four different ways of accessing Azure Data Lake Storage Gen2 in Databricks. However, using the ADLS Gen2 storage account access key directly is the most straightforward option. Before we dive into the actual steps, here is a quick overview of the entire process:

  • Understand the features of Azure Data Lake Storage (ADLS)
  • Create ADLS Gen 2 using Azure Portal
  • Use Microsoft Azure Storage Explorer
  • Create Databricks Workspace
  • Integrate ADLS with Databricks
  • Load Data into a Spark DataFrame from the Data Lake
  • Create a Table on Top of the Data in the Data Lake

Microsoft Azure Data Lake Storage (ADLS) is a fully managed, elastic, scalable, and secure file system that supports HDFS semantics and works with the Apache Hadoop ecosystem. It is built for running large-scale analytics systems that require large computing capacity to process and analyze large amounts of data.

Features:

Limitless storage

ADLS is suitable for storing all types of data coming from different sources like devices, applications, and much more. It also allows users to store relational and non-relational data. Additionally, it doesn’t require a schema to be defined before data is loaded into the store. ADLS can store virtually any size of data, and any number of files. Each ADLS file is sliced into blocks and these blocks are distributed across multiple data nodes. There is no limitation on the number of blocks and data nodes.

Auditing

ADLS creates audit logs for all operations performed in it.

Access Control

ADLS provides access control through the support of access control lists (ACL) on files and folders stored in its infrastructure. It also manages authentication through the integration of AAD based on OAuth tokens from supported identity providers.

Create ADLS Gen2 using Portal:

  1. Login into the portal.
  2. Search for “Storage Account”
  3. Click “Add”

4. Choose Subscription and Resource Group.

5. Give storage account name, location, kind, and replication.

6. In the Advanced Tab, set Hierarchical namespace to Enabled

7. Click “Review+Create”

Microsoft Azure Storage Explorer

Microsoft Azure Storage Explorer is a standalone app that makes it easy to work with Azure Storage data on Windows, macOS, and Linux. Microsoft has also provided this functionality within the Azure portal, where it is currently in preview mode.

  1. Navigate back to your data lake resource in Azure and click ‘Storage Explorer (preview)’.

2. Right-click on ‘CONTAINERS’ and click ‘Create file system’. This will be the root path for our data lake.

3. Name the file system and click ‘OK’.

4. Now, click on the file system you just created and click ‘New Folder’. This is how we will create our base data lake zones. Create folders.

5. To upload data to the data lake, you will need to install Azure Data Lake explorer using the following link.

6. Once you install the program, click ‘Add an account’ in the top left-hand corner, log in with your Azure credentials, keep your subscriptions selected, and click ‘Apply’.

7. Navigate down the tree in the explorer panel on the left-hand side until you get to the file system you created, double click on it. Then navigate into the folder. There you can upload/ download files from your local system.

8. Click “Upload” > “Upload Files”. You can get sample data set from here.

Sample Folder structure:

Create Databricks Workspace

  1. On the Azure home screen, click ‘Create a Resource’

2. In the ‘Search the Marketplace’ search bar, type ‘Databricks’ and you should see ‘Azure Databricks’ pop up as an option. Click that option.

3. Click ‘Create’ to begin creating your workspace.

4. Use the same resource group you created or selected earlier. Then, enter a workspace name.

5. Select ‘Review and Create’.

6. Once the deployment is complete, click ‘Go to resource’ and then click ‘Launch Workspace’ to get into the Databricks workspace.

Integrate ADLS with Databricks:

There are four ways of accessing Azure Data Lake Storage Gen2 in Databricks:

  1. Mount an Azure Data Lake Storage Gen2 filesystem to DBFS using a service principal and OAuth 2.0.
  2. Use a service principal directly.
  3. Use the Azure Data Lake Storage Gen2 storage account access key directly.
  4. Pass your Azure Active Directory credentials, also known as a credential passthrough.

Let’s use option 3.

1. This option is the most straightforward and requires you to run a command that sets the data lake context at the start of every notebook session. Databricks secrets can be used when setting all these configurations.

2. To set the data lake context, create a new Python notebook, and paste the following code into the first cell:

spark.conf.set(
    "fs.azure.account.key.<storage-account-name>.dfs.core.windows.net",
    ""
)

3. Replace ‘<storage-account-name>’ with your storage account name.

4. In between the double quotes on the third line, we will be pasting in an access key for the storage account that we grab from Azure

5. Navigate to your storage account in the Azure Portal and click on ‘Access keys’ under ‘Settings’.

6. Click the copy button, and paste the key1 Key in between the double quotes in your cell

7. Attach your notebook to the running cluster and execute the cell. If it worked, you should see the following:

8. If your cluster is shut down, or if you detach the notebook from a cluster, you will have to re-run this cell to access the data.

9. Copy the command below into a new cell, filling in your relevant details, and you should see a list containing the file you uploaded.

dbutils.fs.ls("abfss://<file-system-name>@<storage-account-name>.dfs.core.windows.net/<directory-name>")

Load Data into a Spark DataFrame from the Data Lake

Towards the end of the Microsoft Azure Storage Explorer section, we uploaded a sample CSV file into ADLS. We will now see how we can read this CSV file from Spark.

We can get the file location from the dbutils.fs.ls command we ran earlier – see the full path as the output.

Run the command given below:

#set the data lake file location (placeholders shown; use your own file system and storage account)
file_location = "abfss://<file-system-name>@<storage-account-name>.dfs.core.windows.net/raw/covid19/johns-hopkins-covid-19-daily-dashboard-cases-by-states.csv"

#read in the data to dataframe df
df = spark.read.format("csv") \
    .option("inferSchema", "true") \
    .option("header", "true") \
    .option("delimiter", ",") \
    .load(file_location)

#display the dataframe
display(df)

Create a table on top of the data in the data lake

In the previous section, we loaded the data from a CSV file into a DataFrame so that it can be accessed using the Python Spark API. Now, we will create a Hive table in Spark with the data in an external location (ADLS), so that the data can be accessed using SQL instead of Python code.

In a new cell, copy the following command:

%sql
CREATE DATABASE covid_research

Next, create the table pointing to the proper location in the data lake.

%sql
CREATE TABLE IF NOT EXISTS covid_research.covid_data
USING CSV
LOCATION 'abfss://<file-system-name>@<storage-account-name>.dfs.core.windows.net/raw/covid19/johns-hopkins-covid-19-daily-dashboard-cases-by-states.csv'

 You should see the table appear in the data tab on the left-hand navigation pane.

To pick up the header row and infer the column types, define the table with the header and inferSchema options, and then run a select statement against it (a sample query from a Python cell is shown after the code block).

%sql
CREATE TABLE IF NOT EXISTS covid_research.covid_data
USING CSV
LOCATION 'abfss://<file-system-name>@<storage-account-name>.dfs.core.windows.net/raw/covid19/johns-hopkins-covid-19-daily-dashboard-cases-by-states.csv'
OPTIONS (header "true", inferSchema "true")
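A quick way to run that select from a Python cell is through spark.sql; this is a minimal sketch that assumes the spark and display notebook globals available in Databricks.

# Query the external table defined above and show the first few rows.
covid_df = spark.sql("SELECT * FROM covid_research.covid_data LIMIT 10")
display(covid_df)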

That concludes our step-by-step guide on accessing Azure Data Lake Storage Gen2 in Databricks, using the ADLS Gen2 storage account access key directly.

Hope you found this guide useful, stay tuned for more.

The scalability, flexibility, cost-efficiency, and improved performance that come with moving to the cloud are becoming too attractive for companies to ignore. Many organizations have started moving to the cloud as a cost-effective option to manage their IT portfolio, avoid expensive capex for the purchase of new servers, and remove the complexities of managing on-premises architecture.

Irrespective of a company’s size, migrating to the cloud is certainly quite an undertaking. Fortunately, Microsoft has created a unique platform with a range of tools to help make the migration fast and smooth, while minimizing the risk and impact to your business.

Microsoft Azure is one of the leading cloud computing service providers that allows businesses to use cloud resources on a pay per use model, therefore you can pay for only what you need and how long you need it. Azure provides multiple options right from Infrastructure as a Service (IaaS) to Platform as a Service (PaaS) to Software as a Service (SaaS), so you can choose from a simple lift and shift approach to a more complex application modernization approach.

The migration journey begins with an assessment of your current setup. Azure provides tools like Azure Migrate for this purpose, and you can additionally leverage other assessment tools from the Azure migration partner ecosystem. Azure Migrate provides a centralized hub to assess and migrate on-premises servers, infrastructure, applications, and data to Azure. With Azure, you can also assess how your workloads will perform, and plan and implement your migration strategy accordingly.

5 Azure Migration Strategies

Here are five strategies that are adopted widely for migrating an application to Azure cloud.

1. Rehosting

Commonly known as “lift and shift”, this is an approach to migrate applications from an on-premise environment to the cloud with no changes to the underlying applications. This is the most popular migration approach as it allows quick migration with little risk of disruption by employing real-time replication during the transition process.

2. Refactoring

This is also known as “repacking”. It involves making small changes to the code and configuration of the application to ensure they are more compatible with the cloud so you can connect them easily to Azure-native infrastructure. This can improve the scalability and maximize the operational cost-efficiency of the platform.

3. Rearchitecting

Also known as “redesigning”, this strategy involves modifying or extending the code base of an application to optimize it to run on Azure. Rearchitecting is a time-consuming migration approach, but still, it offers infinite scalability.

4. Rebuilding

This strategy involves discarding the old application and rebuilding an application or workload from the ground up using the Azure Platform as a Service (PaaS). In this migration strategy, you manage the applications and services you develop, while Azure manages the platform and infrastructure required to run it.

5. Replacing

Under this approach, all the underlying infrastructure, middleware, application software, and application data are in the cloud and managed by Azure in Microsoft datacenters. This is used for greater efficiency and scalability.

3-Step Migration Process

Once you decide on your migration approach, the actual migration to the cloud is a 3-step process (Assess – Migrate – Optimize). Now before you get started with the migration there are a few preliminary considerations to ensure your cloud environment is ready to receive your workloads. You need to ensure that your virtual data center in the cloud contains the elements that are comparable to your on-premises environment. Building the virtual data center in the cloud is a streamlined process and it includes the following

1. Identity

To ensure authenticated access to users between your on-premises environment and workloads that you have migrated to the cloud, you need to invest in a built-in identity management solution. For this purpose, you can use the Azure Active Directory (Azure AD) or other similar solutions.

2. Storage

Migrating to the cloud requires a storage platform that meets the performance needs of your migrated workloads. You can choose from different storage types and configure exact storage requirements based on workloads to ensure security and reliability. You just need to enter a few details to get the right storage for your migration project.

3. Networking

Networking is the backbone of the data center. When migrating to the cloud, you need to keep the applications in the same subnets and IP address ranges to ensure a seamless migration.  You can create a virtual network to maintain the same performance and stability you had in the on-premise data center.

4. Connectivity

During migration, you’ll transfer a large amount of data to the cloud. So it would be wise to opt for a dedicated connectivity option to help with smooth data transfer and have the best user experience. For this purpose, you can use Azure ExpressRoute as it helps in a faster, private connection to Azure and ensures performance and security.

Now it’s time to begin your migration journey to the cloud.

Migration Phase 1: Assessment

Now that you have a better understanding of Azure and how it fits into your migration strategy, it’s time to assess your existing infrastructure. Here are four steps to do that

1. Identification of application and server dependencies

Begin with inventory and assessment of on-premises IT resources to identify opportunities to optimize the IT environment and prioritize which applications and workloads are ideal for migration. Determining your priorities and objectives early can help you have a seamless migration process.

2. Assessment of on-premises applications and servers

Your organization may run hundreds or thousands of servers and virtual machines. You need consolidated planning and a perfect tool to shift them to the cloud. Microsoft offers Azure Migrate service to provide automation for the assessment of on-premises workloads. Ultimately, the goal of this assessment phase is to collect server and application information, including configuration and usage.

3. Configuration analysis

Configuration analysis will help you understand which of your workloads can be migrated with no modifications, which ones require a few modifications, and which workloads are incompatible with the current installation. Essentially this step helps you ensure the proper functioning of the workloads on the cloud.

4. Cost planning

The final step of the assessment phase is to collect resource usage such as CPU, memory, and storage to forecast costs and expenditures. This helps in ascertaining the actual usage of your workload and ensure that your choice meets both performance and economic targets.

Migration Phase 2: Execute Migration

After you’ve completed discovery and assessment, now it’s time to prepare for the next step – migration.  The lift-and-shift method most often employed for server or VM migration is real-time replication, due to its flexibility and capability in staged migration.

1. Real-time Replication

This involves creating a copy of the workload in the cloud and allowing asynchronous replication to keep the copy and the workload in sync. Replication also lets groups of virtual machines be connected to the cloud. Real-time replication also allows the old workload to remain online and accessible during the migration to ensure zero disruptions.

2. Testing

Once the replication is complete, start your application or workloads using an isolated environment that mimics the cloud production environment. It lets you test the application without impacting the on-premise as well as cloud production systems. When you’re fully satisfied, it’s time to perform the final migration.

Migration Phase 3: Optimize

Once the migration phase is complete, you need to ensure a seamless transition of operating workloads in the cloud. This is what the optimize phase is all about.

1. Secure cloud resources

Know the security controls and the capabilities of the new cloud-based application, to ensure that the security measures are working, and responding properly. You should become familiar with the capabilities of the Azure Security Center like centralized policy management, continuous security assessment, actionable recommendations, and more.

2. Protecting Data

Ensure that the workloads and data are having a proper backup, disaster recovery, encryption, and other measures to protect your business from risks. Azure offers multiple mechanisms like Azure disk encryption, Azure Backup, and Azure Site Recovery to protect your data.

3. Monitoring Cloud Health

Azure offers many monitoring services to ensure you have full visibility into your current system status and get unique insights into your applications and infrastructure. The basic monitoring services include Azure Monitor, Service Health, and Azure Advisor. A few of the premium monitoring services include Application Insights, Azure Log Analytics, and Network Watcher.

There are many options and reasons for migrating workloads to Azure.

With this straightforward guide, migration to Azure wouldn’t be a complex task anymore. By having a proper plan and mapping out the key objectives, you can ensure a successful Azure migration.

Want to know the key benefits of using Windows Server and SQL Server with Microsoft Azure? Here are four of them: reducing costs and streamlining IT resources; modernizing by migrating to a flexible, open cloud; innovating new apps or managing existing server apps with unlimited flexibility; and ensuring data protection, security, and business continuity.

Would you like to modernize digital processes to improve profitability and ensure data security? At CloudIQ Technologies, we have a knowledgeable and professional team ready to help you. Contact us today to learn more.

Azure Databricks provides comprehensive end-to-end diagnostic logs of activities performed by Azure Databricks users, allowing your enterprise to monitor detailed Azure Databricks usage patterns.

In this article, we're going to look at sending the logs of an Azure Databricks workspace to a Log Analytics workspace using the diagnostic settings present in the Databricks workspace.

Here are the prerequisites and steps to enable diagnostic settings for Azure Databricks.

Pre-requisites:
  1. User with owner or contributor access where the Databricks workspace is deployed.
  2. Databricks workspace. The diagnostics logging for Azure Databricks service is available only for the Premium plan.
Steps:
  1. Login to Azure portal.
  2. Select the Databricks workspace.

3. Select the diagnostics settings.

4. Now click “+ Add Diagnostics Settings”.

5. Azure Databricks provides diagnostic logs for the following services:

  • DBFS
  • Clusters
  • Pools
  • Accounts
  • Jobs
  • Notebook
  • SSH
  • Workspace
  • Secrets
  • SQL Permissions

6. Here we are going to send the logs to the log analytics workspace.

7. Select all the logs you want and send them to log analytics. Here we’re sending cluster logs.

8. Click Save.

9. Allow some time to ingest the logs to log analytics workspace.

10. Now go to the log analytics workspace where the diagnostics are configured.

11. Select logs. Now using KQL we can query our data sent from the Databricks workspace.

12. The Databricks log tables are found under the LogManagement category.

Databricks Monitoring Dashboard

Here is the simple Databricks Monitoring dashboard we created for

  • Cluster availability
  • Failed job trend
  • Success vs failed job trend

We hope this article helps you set up the right configuration to send the logs of your Azure Databricks workspace to a Log Analytics workspace and build the Databricks monitoring dashboard.

Did you know migrating to Microsoft Azure can reduce your data center footprint by 73%?

Take advantage of your current investments and IT skills in Microsoft technologies. Microsoft applications and Azure have been built to work better together with flexibility, high compatibility and hybrid capabilities. 

Check out this infographic to learn how you can get unparalleled cost savings, easily plan migrations, avoid the complexity of multi-vendor support, and modernize your applications in the cloud with the leader you already trust.

Migrate to Azure at your own pace with confidence and support from CloudIQ. At CloudIQ Technologies, we have a knowledgeable and professional team ready to help you modernize workloads with Azure. Contact us today to learn more.


Break down the cloud journey with four stages of the process—starting with a pre-migration assessment and then looking at migration, post-migration, and optimization. Microsoft Azure has you covered with tools created specifically for you.

Would you like to modernize your apps and data on Azure? At CloudIQ Technologies, we have a knowledgeable and professional team ready to help you. Contact us today to learn more.

It’s time to create and implement business and technology strategies powered by the cloud. Here is your complete game plan. Check the “Cloud Adoption Framework” infographic to plan your strategy to modernize and innovate. 

Would you like to migrate to the cloud? At CloudIQ Technologies, we have a knowledgeable and professional team ready to help you successfully adopt the cloud. Contact us today to learn more.

In our previous blog on getting started with Azure Databricks, we looked at Databricks tables.  In this blog, we will look at a type of Databricks table called Delta table and best practices around storing data in Delta tables.

1. Delta Lake

Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs. Databricks Delta table is a table that has a Delta Lake as the data source similar to how we had a CSV file as a data source for the table in the previous blog.

2. Table which is not partitioned

When we create a delta table and insert records into it, Databricks loads the data into multiple small files.  You can see the multiple files created for the table “business.inventory” below

3. Partitioned table

Partitioning involves putting different rows into different tables.  E.g., if we have an address table with addresses in the US, the addresses might be stored in 50 different tables corresponding to the 50 states in the US.  A view with a union might be created over all of them to provide a complete view of all addresses.

Sample code to create a table partitioned by date column is given below:

CREATE TABLE events (
  date DATE,
  eventId STRING,
  eventType STRING,
  data STRING)
USING delta
PARTITIONED BY (date) 

The table “business.sales” given below is partitioned by InvoiceDate.  You can see that there is a folder created for each InvoiceDate and within the folders, there are multiple files that store the data for this table.

This partitioning will be useful when we have queries selecting records from this table with InvoiceDate in the WHERE clause. 

E.g.:
SELECT SLSDA_ID, RcdType, DistId
FROM business.sales
WHERE InvoiceDate = '2013-01-01'

In total, there are 40,545 files for this table, as you can see in the screenshot below.

4. OPTIMIZE

SMALL FILE PROBLEM

Historical and new data is often written in very small files and directories.  This data may be spread across a data center or even across the world (that is, not co-located).  The result is that a query on this data may be very slow due to

  • network latency
  • volume of file metadata

The solution is to compact many small files into one larger file.

The OPTIMIZE command invokes the bin-packing (compaction) algorithm to coalesce small files into larger ones.  Small files are compacted together into new, larger files of up to 1 GB.
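
For example, compacting the business.sales table from this article could look like the sketch below; the optional WHERE clause (commented out) restricts compaction to specific partitions and must reference the partition column.

-- Compact small files across the whole table
OPTIMIZE business.sales

-- Or compact only selected partitions
-- OPTIMIZE business.sales WHERE InvoiceDate >= '2013-01-01'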

You can see below that the OPTIMIZE command has removed the 40,545 files and added 2,378 files in their place.  Also, observe that after optimization the size of the table has decreased from 1.49 GB to 1.08 GB.

5. Optimize table which is not partitioned

Optimize will compact the small files for tables that are not partitioned too.

The business.finance_transactions_silver table is not partitioned and currently has 64 files with a total size of 858 MB.

Running the OPTIMIZE command coalesces the 64 files into 1 file.

Note that “partitionsOptimized” is 1 in this case.  Previously, for the partitioned table, “partitionsOptimized” was 2509.  The OPTIMIZE command coalesces the small files within a partition only.  If the table is not partitioned, the whole table is considered a single partition.

6. ZORDER
  • Data Skipping is a performance optimization that aims at speeding up queries that contain filters (WHERE clauses).
    As new data is inserted into a Databricks Delta table, file-level min/max statistics are collected for all columns (including nested ones) of supported types. Then, when there’s a lookup query against the table, Databricks Delta first consults these statistics to determine which files can safely be skipped.  This is done automatically and no specific commands are required to be run for this.
  • Z-Ordering is a technique to co-locate related information in the same set of files.
    Z-Ordering maps multidimensional data to one dimension while preserving the locality of the data points.

Given a column that you want to perform ZORDER on, say OrderColumn, Delta

  • Takes existing parquet files within a partition.
  • Maps the rows within the parquet files according to OrderColumn using the Z-order curve algorithm.
  • In the case of only one column, the mapping above becomes a linear sort
  • Rewrites the sorted data into new parquet files.

Note: The table’s partition column cannot also be used as a ZORDER column.

The syntax for ZORDER is:

OPTIMIZE tablename
ZORDER BY (OrderColumn) 
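
For example, if queries on business.sales frequently filter on DistId (a column used in the earlier SELECT), OPTIMIZE and ZORDER can be combined as in the sketch below; pick whichever high-cardinality column your queries actually filter on.

OPTIMIZE business.sales
ZORDER BY (DistId)
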
7. Best practices

a. PARTITION BY

  • Partition the table by a column which is used in the WHERE clause or ON clause (join).  The most commonly used partition column is the date.
  • Use columns with low cardinality.  If the cardinality of a column will be very high, do not use that column for partitioning. For example, if you partition by a column userId and if there can be 1M distinct user IDs, then that is a bad partitioning strategy.
  • Amount of data in each partition: You can partition by a column if you expect data in that partition to be at least 1 GB.  Partitioning is not required for smaller tables.
  • PARTITION BY is done on a single column only

b. OPTIMIZE

  • OPTIMIZE is required for all tables to which we write data continuously on a daily basis.
  • OPTIMIZE is not required for tables that have static data/reference data which are rarely updated.
  • There is a cost associated with OPTIMIZE (Running Optimize command for sales took 6.64 minutes).  We should run it more often (daily) if we want better end-user query performance.  We should run it less often if we want to optimize costs.

c. ZORDER BY

  • If we expect a column to be commonly used in query predicates and if that column has high cardinality (that is, a large number of distinct values), then use ZORDER BY.
  • We can specify multiple columns for ZORDER BY as a comma-separated list. However, the effectiveness of the locality drops with each additional column.

Migrating your IT infrastructure to the cloud has a ton of benefits. Whether you’re looking to improve security and become GDPR compliant, cut your total cost of ownership, or promote teamwork and innovation by integrating AI capabilities, the cloud provides a solution to your IT problems.

Would you like to upgrade your IT infrastructure to the cloud? At CloudIQ Technologies, we have a knowledgeable and professional team ready to help address any of your IT infrastructure upgrade needs. Contact us today to learn more.

Backing up on-premises resources to the cloud leverages the power and scale of the cloud to deliver high availability with no maintenance or monitoring overhead. With the Azure Backup service, the benefits keep adding up, right from data security to centralized monitoring and management.

Azure Backup service uses the MARS agent to back up files, folders, and system state from on-premises machines and Azure VMs. The backups are stored in a Recovery Services vault.

In this article, we will look at how to back up on-premises files and folders using the Microsoft Azure Recovery Services (MARS) agent.

There are two ways to run the MARS agent:

  • Directly on on-premises Windows machines.
  • On Azure VMs that run Windows side by side with the Azure VM backup extension.

Here is the step by step process.

Create a Recovery Services vault

1. Sign in to the Azure portal.
2. On the Recovery Services vaults dashboard, select “Add”.

3. The Recovery Services vault dialog box opens. Provide values for the Name, Subscription, Resource group, and Location.
4. Select “Create”.
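
If you prefer the command line, the vault can also be created with the Azure CLI, as in the sketch below; the resource group, vault name, and location are placeholders.

az backup vault create \
  --resource-group rg-backup-demo \
  --name mars-demo-vault \
  --location eastus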

Download the MARS agent

1. Download the MARS agent so that you can install it on the machines that you want to back up.
2. In the vault, select “Backup”.
3. Select On-premises for “Where is your workload running?”
4. Select Files and folders for “What do you want to back up?”
5. Select “Prepare Infrastructure”.

6. Under “Prepare infrastructure” > Install Recovery Services agent, download the MARS agent.
7. Select “Already downloaded or using the latest Recovery Services Agent”, and then download the vault credentials.
8. Select “Save”.

Install and register the agent

1. Run the MARSagentinstaller.exe file on the VM.
2. In the “MARS Agent Setup Wizard”, select “Installation Settings”.
3. Choose where to install the agent and choose a location for the cache. Select “Next”.

  • The cache is used for storing data snapshots before they are sent to the Recovery Services vault.
  • The cache location should have free space equal to at least 5 percent of the size of the data you’ll back up.

4. For Proxy Configuration, specify how the agent that runs on the Windows machine will connect to the internet. Then select “Next”.

5. For Installation, review, and select “Install”.
6. After the agent is installed, select “Proceed to Registration”.

7. In Register Server Wizard > Vault Identification, browse to and select the credentials file. Then select “Next”.

8. On the “Encryption Setting” page, specify a user-defined passphrase, which is used to encrypt and decrypt backups for the machine.

9. Save the passphrase in a secure location. It is needed while restoring a backup.
10. Select “Finish”.

Create a backup policy

Steps to create a backup policy:

1. Open the MARS agent console.
2. Under “Actions”, select “Schedule Backup”.

3. In the “Schedule Backup Wizard”, select “Getting started” and click “Next”.
4. Under “Select Items to Back up”, select “Add Items”.

5. Select items to back up, and select OK.

6. On the “Select Items to Back Up” page, select “Next”.
7. Specify when to take daily or weekly backups in the “Specify Backup Schedule” page and select “Next”.
8. It is possible to schedule up to three backups per day, and weekly backups can be run as well.

9. On the “Select Retention Policy” page, specify how long to retain copies of your data, and select “Next”.
10. On the Confirmation page, review the information, and then select “Finish”.

11. After the wizard finishes creating the backup schedule, select “Close”.

We hope this step-by-step guide helps you back up on-premises files and folders using the Microsoft Azure Recovery Services (MARS) agent.

Many businesses are struggling to find the talent and capacity to create and manage their machine learning models to actually unlock the insights within their data. Here is an infographic that shows how Microsoft Azure Machine Learning streamlines this process to make modeling accessible to all businesses. 

What’s holding your business back from using AI to turn your data into actionable insights? Azure Machine Learning makes AI more accessible to businesses of all sizes and experience levels by reducing cost and helping you create and manage your models. But you don’t have to go it alone. We can help you assess your business needs and adopt the right AI solution. Contact us today to learn how we can help transform your business with AI.

The distributed nature of cloud applications requires a messaging infrastructure that connects the components and services, ideally in a loosely coupled manner in order to maximize scalability. In this article let’s explore the asynchronous messaging options in Azure.

At an architectural level, a message is a datagram created by an entity (producer), to distribute information so that other entities (consumers) can be aware and act accordingly. The producer and the consumer can communicate directly or optionally through an intermediary entity (message broker).  

Messages can be classified into two main categories. If the producer expects an action from the consumer, that message is a command. If the message informs the consumer that an action has taken place, then the message is an event.

Commands

The producer sends a command with the intent that the consumer(s) will perform an operation within the scope of a business transaction.

A command is a high-value message and must be delivered at least once. If a command is lost, the entire business transaction might fail. Also, a command shouldn’t be processed more than once. Doing so might cause an erroneous transaction. A customer might get duplicate orders or be billed twice.

Commands are often used to manage the workflow of a multistep business transaction. Depending on the business logic, the producer may expect the consumer to acknowledge the message and report the results of the operation. Based on that result, the producer may choose an appropriate course of action.

Events

An event is a type of message that a producer raises to announce facts.

The producer (known as the publisher in this context) has no expectations that the events will result in any action.

Interested consumers can subscribe, listen for events, and take actions depending on their consumption scenario. Events can have multiple subscribers or no subscribers at all. Two different subscribers can react to an event with different actions and not be aware of one another.

The producer and consumer are loosely coupled and managed independently. The consumer isn’t expected to acknowledge the event back to the producer. A consumer that is no longer interested in the events can unsubscribe. The consumer is removed from the pipeline without affecting the producer or the overall functionality of the system.

There are two categories of events:

  • The producer raises events to announce discrete facts. A common use case is event notification. For example, Azure Resource Manager raises events when it creates, modifies, or deletes resources. A subscriber of those events could be a Logic App that sends alert emails.
  • The producer raises related events in a sequence, or a stream of events, over a period of time. Typically, a stream is consumed for statistical evaluation. The evaluation can be done within a temporal window or as events arrive. Telemetry is a common use case, for example, health and load monitoring of a system. Another case is event streaming from IoT devices.

A common pattern for implementing event messaging is the Publisher-Subscriber pattern.

Role and benefits of a message broker

An intermediate message broker provides the functionality of moving messages from producer to consumer and can offer additional benefits.

Decoupling

A message broker decouples the producer from the consumer in the logic that generates and uses the messages, respectively. In a complex workflow, the broker can encourage business operations to be decoupled and help coordinate the workflow.

Load balancing

Producers may post a large number of messages that are serviced by many consumers. Use a message broker to distribute processing across servers and improve throughput. Consumers can run on different servers to spread the load. Consumers can be added dynamically to scale out the system when needed or removed otherwise.

Load leveling

The volume of messages generated by the producer or a group of producers can be variable. At times there might be a large volume causing spikes in messages. Instead of adding consumers to handle this work, a message broker can act as a buffer, and consumers gradually drain messages at their own pace without stressing the system.

Reliable messaging

A message broker helps ensure that messages aren’t lost even if communication fails between the producer and consumer. The producer can post messages to the message broker and the consumer can retrieve them when communication is re-established. The producer isn’t blocked unless it loses connectivity with the message broker.

Resilient messaging

A message broker can add resiliency to the consumers in your system. If a consumer fails while processing a message, another instance of the consumer can process that message. The reprocessing is possible because the message persists in the broker.

Technology choices for a message broker

Azure provides several message broker services, each with a range of features.

Azure Service Bus

Azure Service Bus queues are well suited for transferring commands from producers to consumers. Here are some considerations.

Pull model

A consumer of a Service Bus queue constantly polls Service Bus to check if new messages are available. The client SDKs and Azure Functions trigger for Service Bus abstract that model. When a new message is available, the consumer’s callback is invoked, and the message is sent to the consumer.

Guaranteed delivery

Service Bus allows a consumer to peek the queue and lock a message from other consumers.

It’s the responsibility of the consumer to report the processing status of the message. Only when the consumer marks the message as consumed does Service Bus remove it from the queue. If a failure, timeout, or crash occurs, Service Bus unlocks the message so that other consumers can retrieve it. This way, messages aren’t lost in transfer.

Message ordering

If you want consumers to get the messages in the order they are sent, Service Bus queues guarantee first-in-first-out (FIFO) ordered delivery by using sessions. A session can have one or more messages.
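
As an illustration, a session-enabled queue can be created with the Azure CLI as sketched below; the resource group and namespace names are placeholders, and the namespace must already exist.

az servicebus queue create \
  --resource-group rg-messaging \
  --namespace-name sb-demo-ns \
  --name orders \
  --enable-session true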

Message persistence

Service Bus queues support temporal decoupling. Even when a consumer isn’t available or is unable to process the message, it remains in the queue.

Checkpoint long-running transactions

Business transactions can run for a long time. Each operation in the transaction can have multiple messages. Use checkpointing to coordinate the workflow and provide resiliency in case a transaction fails.

Hybrid solution

Service Bus bridges on-premises systems and cloud solutions. On-premises systems are often difficult to reach because of firewall restrictions. Both the producer and consumer (either can be on-premises or the cloud) can use the Service Bus queue endpoint as the pickup and drop off location for messages.

Topics and subscriptions

Service Bus supports the Publisher-Subscriber pattern through Service Bus topics and subscriptions.
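
For instance, a topic and a subscription could be provisioned with the Azure CLI as in the sketch below; the names are placeholders, and topics require the Standard or Premium tier.

az servicebus namespace create --resource-group rg-messaging --name sb-demo-ns --sku Standard
az servicebus topic create --resource-group rg-messaging --namespace-name sb-demo-ns --name order-events
az servicebus topic subscription create --resource-group rg-messaging --namespace-name sb-demo-ns --topic-name order-events --name billing-service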

Azure Event Grid

Azure Event Grid is recommended for discrete events. Event Grid follows the Publisher-Subscriber pattern. When event sources trigger events, they are published to Event grid topics. Consumers of those events create Event Grid subscriptions by specifying event types and an event handler that will process the events. If there are no subscribers, the events are discarded. Each event can have multiple subscriptions.

Push Model

Event Grid propagates messages to the subscribers in a push model. Suppose you have an event grid subscription with a webhook. When a new event arrives, Event Grid posts the event to the webhook endpoint.

Custom topics

Create custom Event Grid topics, if you want to send events from your application or an Azure service that isn’t integrated with Event Grid.
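
A custom topic and a webhook subscription might be created as in the sketch below; the names and the webhook URL are placeholders.

az eventgrid topic create --resource-group rg-events --name demo-topic --location eastus
az eventgrid event-subscription create \
  --source-resource-id $(az eventgrid topic show --resource-group rg-events --name demo-topic --query id --output tsv) \
  --name demo-subscription \
  --endpoint https://example.com/api/events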

High throughput

Event Grid can route 10,000,000 events per second per region. The first 100,000 operations per month are free.

Resilient delivery

Even though successful delivery for events isn’t as crucial as commands, you might still want some guarantee depending on the type of event. Event Grid offers features that you can enable and customize, such as retry policies, expiration time, and dead lettering.

Azure Event Hubs

When working with an event stream, Azure Event Hubs is the recommended message broker. Essentially, it’s a large buffer that’s capable of receiving large volumes of data with low latency. The data received can be read quickly through concurrent operations. You can transform the data received by using any real-time analytics provider. Event Hubs also provides the capability to store events in a storage account.

Fast ingestion

Event Hubs are capable of ingesting millions of events per second. The events are only appended to the stream and are ordered by time.

Pull model

Like Event Grid, Event Hubs also offers Publisher-Subscriber capabilities. A key difference between Event Grid and Event Hubs is in the way event data is made available to the subscribers. Event Grid pushes the ingested data to the subscribers whereas Event Hub makes the data available in a pull model. As events are received, Event Hubs appends them to the stream. A subscriber manages its cursor and can move forward and back in the stream, select a time offset, and replay a sequence at its pace.

Partitioning

A partition is a portion of the event stream. The events are divided by using a partition key. For example, several IoT devices send device data to an event hub. The partition key is the device identifier. As events are ingested, Event Hubs move them to separate partitions. Within each partition, all events are ordered by time.
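
For illustration, the partition count is chosen when the event hub is created; here is a minimal Azure CLI sketch with placeholder names.

az eventhubs namespace create --resource-group rg-streaming --name eh-demo-ns --location eastus --sku Standard
az eventhubs eventhub create --resource-group rg-streaming --namespace-name eh-demo-ns --name device-telemetry --partition-count 4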

Event Hubs Capture

The Capture feature allows you to store the event stream to Azure Blob storage or Data Lake Storage. This way of storing events is reliable because even if the storage account isn’t available, Capture keeps your data for a period, and then writes to the storage after it’s available.

We hope this quick-start guide helps you get started with Azure messaging and event-driven architecture.

Azure Databricks lets you spin up clusters and build quickly in a fully managed Apache Spark environment with the global scale and availability of Azure. And of course, for any production-level solution, monitoring is a critical aspect.

Azure Databricks comes with robust monitoring capabilities for custom application metrics, streaming query events, and application log messages. It allows you to push this monitoring data to different logging services.

In this article, we will look at the setup required to send application logs and metrics from Microsoft Azure Databricks to a Log Analytics workspace.

Prerequisites
  1. Clone the repository below:
    https://github.com/mspnp/spark-monitoring.git
  2. An Azure Databricks workspace
  3. The Azure Databricks CLI
    A Databricks workspace personal access token is required to use the CLI.
    You can also use the Databricks CLI from Azure Cloud Shell.
  4. A Java IDE with the following resources:
    Java Development Kit (JDK) version 1.8
    Scala language SDK 2.11
    Apache Maven 3.5.4
Building the Azure Databricks monitoring library with Docker

After cloning the repository, open a terminal at the cloned path.

Run the build command for your platform as shown below.

Windows:

docker run -it --rm -v %cd%/spark-monitoring:/spark-monitoring -v "%USERPROFILE%/.m2":/root/.m2 maven:3.6.1-jdk-8 /spark-monitoring/build.sh 

Linux:

chmod +x spark-monitoring/build.sh
docker run -it --rm -v `pwd`/spark-monitoring:/spark-monitoring -v "$HOME/.m2":/root/.m2 maven:3.6.1-jdk-8 /spark-monitoring/build.sh 
Configuring Databricks workspace

dbfs configure --token

It will prompt for the Databricks workspace URL and a personal access token. Use the token that was generated when setting up the prerequisites. You can get the URL from Azure portal > Databricks service > Overview.

 dbfs mkdirs dbfs:/databricks/spark-monitoring 

Open the file /src/spark-listeners/scripts/spark-monitoring.sh and add the Log Analytics workspace ID and key.

Use Databricks CLI to copy the modified script

dbfs cp <local path to spark-monitoring.sh> dbfs:/databricks/spark-monitoring/spark-monitoring.sh 

Use Databricks CLI to copy all JAR files generated

dbfs cp --overwrite --recursive <local path to target folder> dbfs:/databricks/spark-monitoring/ 
Create and configure the Azure Databricks cluster
  1. Navigate to your Azure Databricks workspace in the Azure Portal.
  2. On the home page, click on “new cluster”.
  3. Choose a name for your cluster and enter it in the text box titled “cluster name”.
  4. In the “Databricks Runtime Version” dropdown, select 5.0 or later (includes Apache Spark 2.4.0, Scala 2.11).

  5. Under “Advanced Options”, click on the “Init Scripts” tab. Go to the last line under the “Init Scripts” section and select “DBFS” in the “destination” dropdown. Enter “dbfs:/databricks/spark-monitoring/spark-monitoring.sh” in the text box. Click the “Add” button.

  6. Click the “Create Cluster” button to create the cluster. Next, click on the “Start” button to start the cluster.

Now you can run jobs in the cluster and view the logs in the Log Analytics workspace.
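
As a quick check, you can query one of the custom log tables that the library typically creates, as in the KQL sketch below; table names such as SparkLoggingEvent_CL and SparkMetric_CL can vary with the library version, so confirm them under Custom Logs in your workspace.

SparkLoggingEvent_CL
| where TimeGenerated > ago(1h)
| take 50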

We hope this article helps you set up the right configurations to send application logs and metrics from Azure Databricks to your Log Analytics workspace.

This infographic outlines a day in the life of a remote worker using Microsoft Teams to collaborate, create, and be more productive while working at home. See how this individual uses multiple features to stay connected with the team and work efficiently

The daily life of most workers has changed drastically as COVID-19 has made employees move to home offices. But your team can still get work done, and Microsoft Teams can make it possible. Contact us to enable Microsoft Teams for your organization.

Databricks is a web-based platform for working with Apache Spark that provides automated cluster management and IPython-style notebooks.  To understand the basics of Apache Spark, refer to our earlier blog on how Apache Spark works.

Databricks is currently available on Microsoft Azure and Amazon AWS.  In this blog, we will look at some of the components in Azure Databricks.

1.   Workspace

A Databricks Workspace is an environment for accessing all Databricks assets. The Workspace organizes objects (notebooks, libraries, and experiments) into folders, and provides access to data and computational resources such as clusters and jobs.

Create a Databricks workspace

The first step to using Azure Databricks is to create and deploy a Databricks workspace. You can do this in the Azure portal.

  1. In the Azure portal, select Create a resource > Analytics > Azure Databricks.
  2. Under Azure Databricks Service, provide the values to create a Databricks workspace.

    a. Workspace Name: Provide a name for your workspace.
    b. Subscription: Choose the Azure subscription in which to deploy the workspace.
    c. Resource Group: Choose the Azure resource group to be used.
    d. Location: Select the Azure location near you for deployment.
    e. Pricing Tier: Standard or Premium

Once the Azure Databricks service is created, you will get the screen given below.  Clicking on the Launch Workspace button will open the workspace in a new tab of the browser.

2.   Cluster

A Databricks cluster is a set of computation resources and configurations on which we can run data engineering, data science, and data analytics workloads, such as production ETL pipelines, streaming analytics, ad-hoc analytics, and machine learning.

To create a new cluster:

  1. Select Clusters from the left-hand menu of Databricks’ workspace.
  2. Select Create Cluster to add a new cluster.

We can select the Scala and Spark versions by selecting the appropriate Databricks Runtime Version while creating the cluster.

3.   Notebooks

A notebook is a web-based interface to a document that contains runnable code, visualizations, and narrative text.  We can create a new notebook using either the “Create a Blank Notebook” link in the Workspace (or) by selecting a folder in the workspace and then using the Create >> Notebook menu option.

While creating the notebook, we must select a cluster to which the notebook is to be attached and also select a programming language for the notebook – Python, Scala, SQL, and R are the languages supported in Databricks notebooks.

The workspace menu also provides us the option to import a notebook, by uploading a file (or) specifying a file.  This is helpful if we want to import (Python / Scala) code developed in another IDE (or) if we must import code from an online source control system like git.

In the notebook below, we have Python code executed in cells Cmd 2 and Cmd 3, and PySpark code executed in Cmd 4.  The first cell (Cmd 1) is a Markdown cell.  It displays text that has been formatted using Markdown.

Magic commands

Even though the above notebook was created with Language as python, each cell can have code in a different language using a magic command at the beginning of the cell.  The markdown cell above has the code below where %md is the magic command:

%md Sample Databricks Notebook 

The following provides the list of supported magic commands:

  • %python – Allows us to execute Python code in the cell.
  • %r – Allows us to execute R code in the cell.
  • %scala – Allows us to execute Scala code in the cell.
  • %sql – Allows us to execute SQL statements in the cell.
  • %sh – Allows us to execute Bash Shell commands and code in the cell.
  • %fs – Allows us to execute Databricks Filesystem commands in the cell.
  • %md – Allows us to render Markdown syntax as formatted content in the cell.
  • %run – Allows us to run another notebook from a cell in the current notebook.
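
For example, a cell in a Python notebook can run a SQL statement by starting the cell with the %sql magic command; the table name below is just an illustration.

%sql
SELECT COUNT(*) FROM business.inventory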

4.   Libraries

To make third-party or locally built code available (like .jar files) to notebooks and jobs running on our clusters, we can install a library. Libraries can be written in Python, Java, Scala, and R. We can upload Java, Scala, and Python libraries and point to external packages in PyPI, or Maven.

To install a library on a cluster, select the cluster going through the Clusters option in the left-side menu and then go to the Libraries tab.

Clicking on the “Install New” option provides us with all the options available for installing a library.  We can install the library either by uploading it as a JAR file or by getting it from a file in DBFS (Databricks File System).  We can also instruct Databricks to pull the library from a Maven or PyPI repository by providing the coordinates.

5. Jobs

During code development, notebooks are run interactively in the notebook UI.  A job is another way of running a notebook or JAR either immediately or on a scheduled basis.

We can create a job by selecting Jobs from the left-side menu and then providing the name of the job, the notebook to be run, and the schedule of the job (daily, hourly, etc.).

Once the jobs are scheduled, the jobs can be monitored using the same Jobs menu.

6.   Databases and tables

A Databricks database is a collection of tables. A Databricks table is a collection of structured data. Tables are equivalent to Apache Spark DataFrames. We can cache, filter, and perform any operations supported by DataFrames on tables. You can query tables with Spark APIs and Spark SQL.

Databricks provides us the option to create new Tables by uploading CSV files; Databricks can even infer the data type of the columns in the CSV file.
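
The same can also be done in SQL. The sketch below creates a table over a CSV file; the path points to one of the sample datasets commonly available under /databricks-datasets, and the options ask Spark to read the header row and infer column types.

CREATE TABLE diamonds
USING csv
OPTIONS (
  path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv",
  header "true",
  inferSchema "true"
)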

All the databases and tables created either by uploading files (or) through Spark programs can be viewed using the Data menu option in Databricks workspace and these tables can be queried using SQL notebooks.

We hope this article helps you get started with Azure Databricks. You can now spin up clusters and build quickly in a fully managed Apache Spark environment with the global scale and availability of Azure.

To keep business-critical applications running 24/7/365, it is important for organizations to have a sound business continuity and disaster recovery strategy. In this article, we will discuss how to set up disaster recovery for an Azure VM in a secondary region.

We will use Azure Site Recovery, which helps manage and orchestrate disaster recovery of on-premises machines and Azure virtual machines (VMs), including replication, failover, and recovery.

Prerequisites:
  1. Recovery services vault
  2. A Virtual machine

ENABLING REPLICATION

To replicate a VM to a secondary region, prepare the Site Recovery infrastructure. In this case, we are replicating from one Azure region to another.

For replication from Azure to Azure, you can go directly to the Recovery Services vault and replicate the VM.

  1. Go to recovery services vault >> Replicated Items
  2. Click on the Replicate icon and follow the steps below.

3. Select the following

  • Source: Azure
  • Source location: Region where the VM is deployed
  • Azure virtual machine deployment model: Resource manager
  • Source subscription: Subscription where the VM is deployed
  • Source Resource group: Resource group where the VM is deployed

4. Select OK to proceed to the next step.

5. Select the VM to replicate.

6. Target configurations

  • Target location: Secondary region where the VM is to be replicated
  • Target subscription: Subscription where the VM is to be replicated

By default, the following resources are created in the target region and can be customized per your needs:

  • Resource group
  • Virtual network
  • Cache storage account
  • Replica managed disks
  • Target availability sets (if applicable)

You can set replication policies and view extension details here.

7. Click on Create target resources.

8. Then select Enable replication.

Go to Recovery Services vault >> Monitoring >> Site Recovery jobs to view the jobs that run during replication.

Look for the following jobs

  1. Prerequisites check for enabling protection
  2. Installing Mobility Service and preparing target
  3. Enable replication
  4. Starting initial replication
  5. Updating the provider states

Once the above-mentioned jobs are complete, here is what happens:

  • Synchronization process begins
  • Waiting for first recovery point
  • The VM is protected

Once all processes are completed, you can view the VM by going to recovery services vault >> Replicated Items

Introduction to Terraform

Terraform is an open-source tool for managing cloud infrastructure. Terraform uses Infrastructure as Code (IaC) for building, changing and versioning infrastructure safely. Terraform is used to create, manage, and update infrastructure resources such as virtual machines, virtual networks, and clusters.

The Terraform CLI provides a simple mechanism to deploy and version the configuration files to Azure. And with the AzureRM provider, you can create, modify, and delete Azure resources in a Terraform configuration.

The infrastructure that Terraform can manage, includes low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries, SaaS features, etc.

Providers in Terraform

A provider is responsible for understanding API interactions and exposing resources. Providers are generally IaaS platforms, such as:

  • Azure
  • AWS
  • Google Cloud
  • OpenStack
  • Docker
  • Alibaba Cloud
  • VMware   

For each provider, there are many kinds of resources you can create. Here is the general syntax for Terraform resources:

resource "<provider>_<type>" "<name>" {
  [config]
}

Where PROVIDER is the name of a provider (e.g., Azure), TYPE is the type of resource to create in that provider (e.g., instance), NAME is an identifier you can use throughout the Terraform code to refer to this resource, and CONFIG consists of one or more arguments that are specific to that resource.
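
For instance, a resource group in Azure could be declared as in the sketch below; the resource name and location are placeholders.

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "example" {
  name     = "rg-terraform-demo"
  location = "East US"
}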

Terraform Features:

Infrastructure as Code

Infrastructure is described using a high-level configuration syntax. This allows a blueprint of your datacenter to be versioned and treated as you would any other code. Additionally, infrastructure can be shared and re-used.

Execution Plans

Terraform has a “planning” step where it generates an execution plan. The execution plan shows what Terraform will do when you call apply. This lets you avoid any surprises when Terraform manipulates infrastructure.

Resource Graph

Terraform builds a graph of all your resources and parallelizes the creation and modification of any non-dependent resources. Because of this, Terraform builds infrastructure as efficiently as possible, and operators get insight into dependencies in their infrastructure.

Change Automation

Complex changesets can be applied to your infrastructure with minimal human interaction. With the previously mentioned execution plan and resource graph, you know exactly what Terraform will change and in what order, avoiding many possible human errors.

TERRAFORM STRUCTURE

The primary module structure requirement is that a “root module” must exist. The root module is the directory that holds the Terraform configuration files that are applied to build your desired infrastructure. Any module should include, at a minimum, a “main.tf”, a “variables.tf” and “outputs.tf” file.

main.tf calls modules, locals, and data-sources to create all resources. If using nested modules to split up your infrastructure’s required resources, the “main.tf” file holds all your module blocks and any needed resources not contained within your nested modules.

variables.tf contains the input variable declarations.

outputs.tf tells Terraform what data is important. This data is outputted when “apply” is called and can be queried using the Terraform “output” command. It contains outputs from the resources created in main.tf.

TFVARS file – To persist variable values, create a file and assign variables within it. Terraform automatically loads all files in the current directory that match terraform.tfvars or *.auto.tfvars to populate variables.
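
Putting these together, a minimal layout might look like the sketch below; the variable, output, and value names are placeholders.

# variables.tf
variable "location" {
  type    = string
  default = "East US"
}

# outputs.tf
output "resource_group_id" {
  value = azurerm_resource_group.example.id
}

# terraform.tfvars
location = "West Europe"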

MODULES

Modules are subdirectories with self-contained Terraform code. A module is a container for multiple resources that are used together. The root module is the directory that holds the Terraform configuration files that are applied to build your desired infrastructure. The root module may call other modules and connect them by passing output values from one as input values of another.

In production, we may need to manage multiple environments, and different products with similar infrastructure. Writing code to manage each of these similar configurations increases redundancy in the code.  And finally, we need the capability to test different versions while keeping the production infrastructure stable.

Terraform provides modules that allow us to abstract away re-usable parts, which can be configured once, and used everywhere. Modules allow us to group resources, define input variables which are used to change resource configuration parameters and define output variables that other resources or modules can use.

Modules can also call other modules using a “module” block, but we recommend keeping the module tree relatively flat and using module composition as an alternative to a deeply nested tree of modules, because this makes the individual modules easier to re-use in different combinations.

Terraform Workflow:

There are four main steps in the Terraform workflow:

  • Init
  • Plan
  • Apply
  • Destroy
INIT

Initialize the Terraform configuration directory using Terraform “init”.

Init creates a hidden directory “.terraform” and downloads the plugins needed by the configuration. Init also accepts the “-backend-config” option, which can be used for partial backend configuration.

Command

terraform init -backend-config="backend-dev.config"

backend-dev.config – This file contains the details shown in the screenshot below.

PLAN

The terraform plan command is used to create an execution plan. The plan lets you see all the resources that will be created, updated, or deleted before the changes are applied. The actual creation happens with the “apply” command.

The var file given will define resources that are unique for each team.

Command:

terraform plan -var-file="parentvarvalues.tfvars"

This file includes all global variables and Azure subscription details.

APPLY

The Terraform “apply” command is used to apply changes in the configuration. You’ll notice that the “apply” command shows you the same “plan” output and asks you to confirm if you want to proceed with this plan.

The “-auto-approve” parameter skips the confirmation prompt when creating resources. It’s better to omit it when applying directly without first running “plan”, so you can review the changes before they are made.

Command:

terraform apply -var-file="parentvarvalues-team1.tfvars" -auto-approve

Terraform State Management:

Terraform stores the resources it manages into a state file. There are two types of state files: “remote” and “local”. While the “local” state is great for an isolated developer, the “remote” state is quite indispensable for a team, as each member will need to share the infrastructure state whenever there is a change.

Terraform compares those changes with the state file to determine what changes result in a new resource or resource modifications. Terraform stores the state about our managed infrastructure and configuration. This state is used by Terraform to map real-world resources to our configuration, keep track of metadata, and to improve performance for large infrastructures.

TERRAFORM IMPORT

The terraform import command is used to import existing infrastructure. This allows you to take resources you’ve created by some other means and bring them under Terraform management. This is a great way to slowly transition infrastructure to Terraform.

resource "azurerm_resource_group" "example" {
  # instance configuration
}

You want to import the state of resources that already exist, so that the next time you run the “apply” command, Terraform already knows that the resource exists, and any changes made going forward will be picked up as modifications.
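
For example, an existing resource group could be brought under management with a command along these lines; the resource address must match the block in your configuration, and the placeholders must be replaced with real IDs.

terraform import azurerm_resource_group.example /subscriptions/<subscription_id>/resourceGroups/<resource_group_name>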

79% of analytics users encounter data questions they can’t solve each month. How are you helping your team to uncover insights from data?

Moving data isn’t always easy. There are more than 340 types of databases in use today and moving data across them presents challenges for any IT team. At CloudIQ Technologies, we have years of experience helping businesses find the IT solutions that can keep up with their constantly evolving business practice. Whether you need a solution for storage, data transfer, or just need to gain better insights from your data, we can help.

KEXP is known internationally for their music and authenticity. To help bring their global audience the music they want, KEXP needed to find a solution that could bring all of their services online. While they eventually accomplished their goal, they did run into roadblocks along the way.

Your partners at CloudIQ Technologies and Microsoft can help you overcome any obstacle. With years of industry experience, you can rest assured knowing your custom IT solution will be up and running in no time. Contact us today to find out more on how we can help.

Cloud security is a major challenge for organizations running mission critical applications on cloud. One of the biggest risks from hackers come via open ports, and Microsoft Azure Security Center provides a great option to manage this threat – Just-in-Time VM access!

What is Just-in-Time VM access?

With Just-in-Time VM access, you can define which VMs and which ports can be opened and controlled, and for how long. Just-in-Time access locks down and limits the ports of Azure virtual machines in order to protect against malicious attacks, providing access to a port only for a limited amount of time. Basically, you block all inbound traffic at the network level.

When Just-In-Time access is enabled, every user’s request for access will be routed through Azure RBAC, and access will be granted only to users with the right credentials. Once a request is approved, the Security Center automatically configures the NSGs to allow inbound traffic to these ports – only for the requested amount of time, after which it restores the NSGs to their previous states.

The Just-in-Time option is available only in the Security Center standard tier and is applicable only to VMs deployed via Azure Resource Manager.

What are the permissions needed to configure and use JIT?

To configure or edit a JIT policy for a VM, assign these actions to the role:

  • On the scope of a subscription or resource group that is associated with the VM: Microsoft.Security/locations/jitNetworkAccessPolicies/write
  • On the scope of a subscription or resource group of the VM: Microsoft.Compute/virtualMachines/write

To request JIT access to a VM, assign these actions to the user:

  • On the scope of a subscription or resource group that is associated with the VM: Microsoft.Security/locations/jitNetworkAccessPolicies/initiate/action
  • On the scope of a subscription or resource group that is associated with the VM: Microsoft.Security/locations/jitNetworkAccessPolicies/*/read
  • On the scope of a subscription or resource group or VM: Microsoft.Compute/virtualMachines/read
  • On the scope of a subscription or resource group or VM: Microsoft.Network/networkInterfaces/*/read

Why Just-in-Time access?

Consider the scenario where a virtual machine is deployed in Azure, and the management port is opened for all IP addresses all the time. This leaves the VM open to brute force attacks.

Brute force attacks usually target management ports like SSH (22) and RDP (3389). If the attacker compromises the security, the whole VM is open to them. Even though we might have NSG firewalls enabled in our Azure infrastructure, it’s best to limit the exposure of management ports to the team and only for a limited amount of time.

How to enable JIT?

Just-In-Time access can be enabled in two ways:

  1. Go to Azure security center and click on Just-in-Time VM

2. Go to the VM, then click on configuration and “Enable JIT”

How to set up port restrictions?

Go to Azure Security Center and click on recommendations for Compute & Apps.

Select the VM and click Enable JIT on VMs.

It will then show a list of recommended ports. It is possible to add additional ports as per requirement. The default port list is shown below.

Now click on the port that you wish to restrict. A new tab will appear with information on the protocol to be allowed and the allowed source IPs (individual IP addresses or a CIDR range).

The main thing to note is the request time. The default time is 3 hours; it can be increased or decreased as per the requirement. Then click OK.

Click OK and the VM will appear in the Just-in-Time access window in the security center.

What changes will happen in the infrastructure when JIT is enabled?

Azure Security Center will create a new Deny rule in the network security group’s inbound security rules, with a lower priority number (and therefore higher precedence) than the original management port’s Allow rule.

If the VM is behind an Azure firewall, the same rule overwrite occurs in the Azure firewall as well.

How to connect to the JIT enabled VM?

Go to Azure security center and navigate to Just-in-Time access.

Select the VM that you need to access and click on “Request Access”

This will take you to the next page, where extra details need to be provided for connectivity, such as:

  1. Click the ON toggle
  2. Provide Allowed IP ranges
  3. Select time range
  4. Provide a justification for VM Access
  5. Click on Open Ports

This process will override the NSG Deny rule by creating a new Allow rule for the selected port with a lower priority number (higher precedence) than the Deny rule.

The above-mentioned connectivity process includes two things

  1. IP Range
  2. My IP options

IP Range:

In this option, we can provide either a single IP or a CIDR block.

MY IP:

Case 1: If you’re connected to Azure via a public network, i.e., without any IPsec tunnel, then when you select MY IP, the IP address that gets registered will be the public IP address of the device you’re connecting from.

Case 2: If you’re connected to Azure via a VPN / IPsec tunnel / VNet gateway, you can’t use the MY IP option, since it directly captures the public IP, which isn’t the address your traffic arrives from. In this scenario, provide the private IP address of the VPN gateway for a single user or, to allow a group of users, provide the private IP CIDR block of the whole organization.

How to monitor who’s requested the access?

Users who request access are recorded in the subscription’s activity log. To view these activities in a Log Analytics workspace, link the subscription’s activity log to Log Analytics.

Once the activity log is configured to be sent to Log Analytics, it is possible to view the list of users who requested access with the help of the Kusto Query Language (KQL).
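
Once the activity log is flowing into the workspace, a query along the lines of the sketch below can list who initiated JIT access requests; depending on the workspace schema, the operation column may be OperationName or OperationNameValue.

AzureActivity
| where OperationNameValue contains "jitNetworkAccessPolicies/initiate"
| project TimeGenerated, Caller, ResourceGroup, ActivityStatusValue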

In September 2019, Azure announced a brand-new service – Azure Private Link, a very important tool for service providers providing a mix of Azure IaaS and PaaS services.

What does Azure Private Link do?

Azure Private Link enables you to access Azure PaaS Services (for example, Azure Storage and SQL Database) and Azure-hosted customer-owned/partner services over a Private Endpoint in your virtual network. Traffic between your virtual network and the service traverses over the Microsoft backbone network, eliminating exposure from the public Internet. It can be used via a local IP address (on Azure and from on-premises networks) or via a dedicated Azure ExpressRoute network.

Importance of Azure Private Link

Well, naturally, the first benefit is security!  It reduces the exposure of PaaS services to the Internet and provides a secure way to manage traffic between the client’s network and Azure. With Private Link Service, data stays within Microsoft’s system and the client’s private network.

For service providers and their clients, this is obviously critical as it provides secure access to customers in their virtual network while giving them the ability to use the resources in the service provider’s subscription.

Find out how a Private Link Service can be created behind a standard load balancer.

In the example below, Kubernetes Ingress Service is exposed as a Private Link Service. The ingress has a Standard Load Balancer with IP Address 172.17.1.100.

Details of Ingress Service (Internal Load Balancer) 

$ kubectl get service -A | grep LoadBalancer
dev    ciq-demo-ingress-nginx-ingress-controller    LoadBalancer   192.168.3.11    172.17.1.100    80:32314/TCP,443:30694/TCP   43h

The service can be accessed as shown below from within the VNet (ciq-demo-vnet):

http://172.17.1.100/web/api/imageresult
Added this method for testing this API in API-MGMT. The current time is : 02/20/2020 10:07:23

The Private Link service is created with the following details:

  • Alias – It is a unique URI identifying the service and can be accessed from anywhere within Azure.
  • NAT IP – This determines the Source IP and Destination IP of incoming and outgoing packets to the Private Link service, respectively. This NAT IP can be within any subnet of the service provider VNET

Next, you create a private endpoint in the consumer vnet/subnet. In our example, we have created a network interface in the ciq-devops-general-rq-vnet/default vnet/subnet. The private IP within the vnet/subnet is 10.0.0.4. The Kubernetes ingress service can be accessed from the consumer vnet using the 10.0.0.4 private IP.

$ curl http://10.0.0.4/web/api/imageresult
Added this method for testing this API in API-MGMT. The current time is : 02/20/2020 10:09:03

Private Link can be enabled for other Azure Resources, such as below.

For example, the private endpoint was enabled for a Storage account.

$ curl http://k8sworkshopstg.blob.core.windows.net/test/hw.txt

Hello World!

$ nslookup k8sworkshopstg.blob.core.windows.net
Server:         127.0.0.53
Address:        127.0.0.53#53

Non-authoritative answer:
k8sworkshopstg.blob.core.windows.net    canonical name = k8sworkshopstg.privatelink.blob.core.windows.net.
Name:   k8sworkshopstg.privatelink.blob.core.windows.net
Address: 10.0.0.5

$ curl http://k8sworkshopstg.privatelink.blob.core.windows.net/test/hw.txt

Hello World!
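
For reference, a private endpoint for a storage account like the one above could be created with the Azure CLI roughly as sketched below; the resource group names are placeholders, the storage account and VNet names are taken from the example, and older CLI versions use --group-ids instead of --group-id.

az network private-endpoint create \
  --resource-group rg-consumer \
  --name pe-k8sworkshopstg \
  --vnet-name ciq-devops-general-rq-vnet \
  --subnet default \
  --private-connection-resource-id $(az storage account show --name k8sworkshopstg --resource-group rg-storage --query id --output tsv) \
  --group-id blob \
  --connection-name pe-k8sworkshopstg-conn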

Welcome to Cloud View!

This week we look at why 2020 will be the launch of the ‘Data Decade’, top CX trends, general availability of Azure Sphere, and troubleshooting common problems in Kubernetes Deployments.

Industry News & Perspectives

Launch of the “Data Decade”

CRN asked nearly 80 CEOs five questions about how digital technologies will shape up in 2020 and beyond. Here is a summary of what they said.

Top CX Trends

Customer experience is no longer just the responsibility of client-facing departments. As more customers shop online, CX has become a top priority for CIOs as well. Here are 5 technology trends that should be a part of every CIO’s strategy.


Technology Updates

Azure Sphere

From its inception in Microsoft Research to general availability today, Azure Sphere is Microsoft’s answer to escalating IoT threats. An interview with Galen Hunt, distinguished engineer and product leader of Azure Sphere.


DevOps and Agility

As DevOps becomes mainstream and with a range of frameworks to choose from, is DevOps losing its agility? Maybe, maybe not! Here is an article that will reconnect you to the grounding principles of DevOps: innovation and agility.


Microsoft Datacenters in Spain

Microsoft announces its strategy for establishing new European datacenters in Spain. While there is no fixed launch date, the company has announced that the proposed DCs will deliver Azure, Microsoft 365, Dynamics 365, and the Power Platform.


From CloudIQ

Optimizing Azure Cosmos DB Performance

Azure Cosmos DB allows Azure platform users to elastically and independently scale throughput and storage across any number of Azure regions worldwide. Here is an article on how to optimize Cosmos DB performance.

How to Debug and Troubleshoot Common Problems in Kubernetes Deployments

Kubernetes deployment issues are not always easy to troubleshoot. In some cases, the errors can be resolved easily, and, in some cases, detecting errors requires us to dig deeper and run various commands to identify and resolve the issues. Here is a guided tutorial to debug applications that are deployed into Kubernetes.

Welcome to Cloud View!

This week we look at Microsoft’s Azure strategy, Gartner’s getting smarter about digital business, DevSecOps, debugging Kubernetes applications and a deep dive into Kubernetes networking.

Industry Viewpoints & News

Smarter with Gartner

Kickstart the week with this incredibly ‘smart’ article by Smarter with Gartner. The article sounds a warning about sticking to the old ways of doing digital.

Microsoft’s Azure Strategy

Gavriella Schuster, Microsoft’s Channel Chief, sat down for an illuminating chat with CRN. The conversation revealed Microsoft’s Azure strategy, its channel investment priorities, and its upcoming plans for its partners.

Google Cloud acquires Cornerstone

Google Cloud just finalized another big acquisition with Cornerstone Technology, a mainframe specialist. The new purchase fits in perfectly with Google Cloud’s strategy to make the shifting of legacy applications on to the cloud easier.


Technical Insights 

DevSecOps

When paired together, Security and DevOps can offer organizations more robust and baked-in security. Find out how companies can do DevSecOps correctly in this two-part series by Devops.com.


Azure Firewall Manager

Microsoft extends Azure Firewall Manager preview to include automatic deployment and central security policy management for Azure Firewall in hub virtual networks.


Debugging a Kubernetes Application

General troubleshooting and debugging techniques for an application running in a Kubernetes environment and the most common issues to expect.


From CloudIQ

Kubernetes Networking Deep Dive – Data Plane, how it Works Under the Hood?

Kubernetes is simple enough to get started, however one of the most complex and critical part is the networking. Here is a deep dive into the Data Plane and how it works under the hood.

Microservices – Aligning business and technology for closer collaboration, agility & flexibility

A microservices-based architecture introduces agility and flexibility and supports a sustainable DevOps culture, ensuring closer collaboration within businesses. The good news is that it’s happening for those who embrace it.

Welcome to Cloud View!

This week we look at Oracle Cloud Data Science Platform, GKE support for Windows, DevSecOps and making better quality software using Jenkins CI/CD pipeline.

Industry Perspective and News

Oracle Cloud Data Science Platform

Oracle is pulling out all stops in 2020! The company is in the news once again with the launch of its data science platform – a first of its kind cloud-native platform that is completely geared towards providing a collaborative workspace for data scientists.

Windows on GKE

Google Cloud supports Windows on GKE, as a part of its commitment to providing complete support for clients’ Windows Server-based applications.

Six I’s of Successful IT Leaders

As every company becomes a tech company, the role of the CIO becomes critical for success. Naturally, this increases the pressure on CIOs to deliver more and think strategically. Here is an article that outlines 6 key focus areas that CIOs can start with.


Technological Insights

Storage is getting a reboot!

IBM is the first one to revamp its storage lines with the launch of Storwize and Flash Systems A9000.


DevSecOps

Security as a topic is never far off in any IT discussion. A useful webinar on DevOpsTV by Jake King, CEO of Cmd. He lays out 7 DevOps-friendly techniques that will help you incorporate security without compromising on speed or scale.


CockroachDB

A fantastic chat with Peter Mattis, the creator of the CockroachDB open-source database and co-founder and CTO of Cockroach Labs. A conversation that covers interesting bits from his career in open source and Google.


From CloudIQ

Make Better Quality Software using Jenkins for your CI/CD Pipeline

With Jenkins, organizations can accelerate the software development process through automation. Jenkins integrates development life-cycle processes of all kinds, including build, document, test, package, stage, deploy, static analysis and much more.

Blue Green Deployment on Azure, Safe Strategy with Zero Downtime

When you are deploying a new change into production, the associated deployment should be in a predictable manner. In simple terms, this means no disruption and zero downtime!

Welcome to Cloud View!

This week we look at the State of DevOps, Hyperledger Fabric on AKS Marketplace template, Serverless computing frameworks and InfluxDB on GCP.

Industry News & Perspectives

Robot Resource Organizations

The adoption of new digital technologies and the ever-changing expectations of customers continue to challenge traditional retailers, forcing them to investigate new human-machine hybrid operational models, including artificial intelligence (AI), automation, and robotics.

Oracle Cloud

While the cloud market pie is divided into 3-4 large slices, there is still plenty of business in the thin sliver left for ‘others’. Oracle Cloud is doing its best to capture this small market and maybe become more than a niche infrastructure provider.

State of DevOps

The 8th annual ‘State of DevOps’ survey reports that the retail sector (as always) is the most advanced when it comes to DevOps adoption. Read on for more details.


Technical Insights

Hyperledger Fabric on AKS Marketplace template

Users with little knowledge of Azure or Hyperledger Fabric will now be able to easily set up a blockchain consortium on Azure with the new Hyperledger Fabric on Azure Kubernetes Service marketplace template.


Serverless computing frameworks

A new report by Datadog shows that almost 50% of the companies using its platform are opting for the AWS Lambda serverless computing framework.


InfluxDB on GCP

Database provider InfluxDB is now live on GCP. It announced the availability of its managed cloud service as a part of Google Cloud’s open-source umbrella. Next on the agenda is the rollout of its second-generation serverless offering on Azure.


From CloudIQ

Develop Faster with Continuous Integration & the Tools to Get the Job Done

In recent years CI has become a best practice for software development and is guided by a set of key principles. Among them are revision control, build automation and automated testing.

Installing and Using HELM, the Package Manager for Kubernetes

Helm is a package manager for Kubernetes that allows developers and operators to more easily package, configure, and deploy applications and services onto Kubernetes clusters.

Welcome to Cloud View!

This week we look at Stackshare’s top 140 tools for developers in 2019, auto-labelling tool for AI developers, using Terraform for multi-cloud orchestration strategy and more.

Industry Perspective & News

Launchable aims to increase delivery velocity

This is the best use of any extra 25 minutes you have today. Jenkins founder Kohsuke Kawaguchi (KK) and respected DevOps veteran Harpreet Singh have launched a new company called Launchable that aims to increase delivery velocity. They talk all about it in this podcast (if you prefer to read, then the link has a complete transcript too).  

AWS maintains market share

January ended with the news of AWS posting revenues of $9.95 billion in Q4 2019 – bringing its 2019 revenue up to a grand total of $40bn. But, as we say, “cloud is complex,” and there are plenty of big fish in the cloud market.


Tech Insights

Top 140 tools for developers in 2019

Here is an article every developer MUST bookmark. StackShare analyzed over four million data points shared in their community and shortlisted the definitive list of top 140 tools for developers in 2019!


Auto-labelling tool for AI developers

After that mammoth list, we decided to continue with the tool theme. So here is another one – a new auto-labeling tool for AI developers by IBM.


Cisco HyperFlex Application Platform (HXAP)

Cisco launched a tool (or rather a whole bouquet of tools) that lets customers build their own cloud-native environments. The HyperFlex Application Platform (HXAP) offers a whole host of integrated tools such as container networking, storage, a load balancer, and more.


From CloudIQ

Kubernetes on Azure: A 2-day workshop for AKS developers

Container technology has revolutionized the DevOps landscape and offers organizations the chance to develop and test applications faster and more cost-effectively. CloudIQ’s 2-day hands-on workshop is designed to give DevOps team members the opportunity to skill-up and learn Kubernetes design, deployment, and management.

Terraform for Multi-Cloud Orchestration Strategy

Terraform, being cloud-agnostic, allows a single configuration to be used to manage multiple providers and even handle cross-cloud dependencies, simplifying management and orchestration.

There are many DevOps lifecycle tools out there; however, GitLab is a complete package designed for coordinating CI/CD pipelines.

GitLab is a web-based DevOps lifecycle tool. This application offers functionality to automate the entire DevOps life cycle from planning to creation, build, verification, security testing, deployment, and monitoring, and it offers high availability and replication. It is highly scalable and can be used on-prem or on the cloud. GitLab also includes a wiki, issue-tracking, and CI/CD pipeline features.

When DevOps projects are spread across large, geographically dispersed teams a complete DevOps tool is highly useful to maintain collaboration, incorporate feedback, avoid mistakes, and speed up the development process.

GitLab goes beyond being just a repository manager; it has a built-in CI/CD, which saves enormous amounts of time and keeps the workflow smooth. Along with its own CI/CD, GitLab also allows for a range of 3rd party integrations with external CI, so you always have the option of working with the tools based on your workflow. 

Here is a quick run-through on how to start with GitLab.

Project:

In GitLab, we can create projects for hosting a codebase, use them as an issue tracker, collaborate on code, and continuously build, test, and deploy apps with built-in GitLab CI/CD. Projects can be public, internal, or private, at our choice. GitLab does not limit the number of private projects we can create.

Create a project in GitLab

In the dashboard, click the green “New project” button or use the plus icon in the navigation bar. This opens the New Project page.

On the New Project page:

  • Create a Blank project
  • Fill in the name of your project in the Project name field
  • The Project URL field is the URL path for the project that the GitLab instance will use
  • The Project slug field will be auto-populated
  • The Project description (optional) field enables you to enter a description for the project’s dashboard
  • Changing the Visibility Level modifies the project’s viewing and access rights for users
  • Selecting the Initialize repository with a README option creates a README file so that the Git repository is initialized, has a default branch, and can be cloned
  • Click Create project

Repository

A repository is one part of a project; a project also includes many other features.

Host your codebase in GitLab repositories by pushing files to GitLab. You can either use the user interface (UI) or connect your local computer with GitLab through the command line.

GitLab Basic Commands: https://docs.gitlab.com/ee/gitlab-basics/command-line-commands.html

Branch

When you create a new project, GitLab sets the master as the default branch for your project. You can choose another branch to be your project’s default under your project’s Settings > Repository.

Commits

When you commit your changes, you are introducing those changes to your branch. Via a command line, you can commit multiple times before pushing.

A commit message is important to identify what is being changed and, more importantly, why. In GitLab, you can add keywords to the commit message that will perform one of the actions below:

  • Trigger a GitLab CI/CD pipeline: If you have your project configured with GitLab CI/CD, you will trigger a pipeline per push, not per commit.
  • Skip pipelines: You can add the keyword [ci skip] to your commit message, and GitLab CI will skip that pipeline (see the sketch after this list).
  • Cross-link issues and merge requests: Cross-linking is great to keep track of what’s somehow related in your workflow. If you mention an issue or a merge request in a commit message, they will be shown on their respective thread.
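For example, a commit that skips the pipeline and cross-links an issue might look like this (a minimal sketch; the file and issue number are illustrative):

git add README.md
git commit -m "Add usage notes for issue #42 [ci skip]"
git push origin master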

CI/CD Pipeline

Continuous Integration works by pushing small code chunks to your application’s codebase hosted in a Git repository and, on every push, running a pipeline of scripts to build, test, and validate the code changes before merging them into the main branch.

Continuous Delivery and Deployment take CI a step further, deploying your application to production on every push to the default branch of the repository.

These methodologies allow you to catch bugs and errors early in the development cycle, ensuring that all the code deployed to production complies with the code standards you established for your app.

Two top-level components are:

  1. .gitlab-ci.yml
  2. GitLab Runner

.gitlab-ci.yml

The .gitlab-ci.yml file is where we configure what CI does with the project. It lives in the root of the repository. On any push to the repository, GitLab will look for the .gitlab-ci.yml file and start jobs on Runners according to the contents of the file, for that commit.

Pipeline configuration begins with jobs. Jobs are the most fundamental element of a .gitlab-ci.yml file.

Jobs are:

  • Defined with constraints stating under what conditions they should be executed
  • Top-level elements with an arbitrary name that must contain at least the script clause
  • Not limited in how many can be defined

For example:

job1:
  script: "execute-script-for-job1"

job2:
  script: "execute-script-for-job2"

GitLab Runner

GitLab Runner is a build instance that is used to run jobs and send the results back to GitLab. It is designed to run on the GNU/Linux, macOS, and Windows operating systems.

Runners run the code defined in .gitlab-ci.yml. They are isolated (virtual) machines that pick up jobs through the coordinator API of GitLab CI. If we want to use Docker, install the latest version. GitLab Runner requires a minimum of Docker v1.13.0.

Types of Runner:
  1. Specific Runner – useful for jobs that have special requirements or for projects with a specific demand.
  2. Shared Runner – useful for jobs that have similar requirements between multiple projects. Rather than having multiple Runners idling for many projects, you can have a single or a small number of Runners that handle multiple projects.
  3. Group Runner – useful when you have multiple projects under one group and would like all projects to have access to a set of Runners. Group Runners process jobs using a FIFO (First In, First Out) queue.
How to use Runner:
  1. Install Runner  –  https://docs.gitlab.com/runner/#install-gitlab-runner
  2. Register Runner – https://docs.gitlab.com/runner/register/index.html

Sample Docker Project:

  1. Create/upload two files: .gitlab-ci.yml and Dockerfile
File Contents:
.gitlab-ci.yml:

# Official docker image.
image: docker:19.03.0-dind

services:
  - docker:19.03.0-dind

before_script:
  - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY

build-master:
  stage: build
  script:
    - docker build --pull -t "$CI_REGISTRY_IMAGE" .
    - docker push "$CI_REGISTRY_IMAGE"
  tags:
    - docker

Dockerfile

FROM python:2.7
RUN pip install howdoi
CMD ["howdoi"]

Once the .gitlab-ci.yml file is created, each push will trigger the pipeline. Since we haven’t created the Runner yet, add “[ci skip]” to the commit message while creating these files; this will skip the CI/CD pipeline.

2. Install Runner

For this example, we are using a specific Runner and are going to install it on Windows. Refer to this link to install the Runner on Windows: https://docs.gitlab.com/runner/install/windows.html

3. Register Runner:

In order to register a runner, we need a registration token, which can be found under Settings > CI/CD > Runners.

Check the Specific Runner section below; you can copy the token from there.

To register a Runner under Windows, run the following command from the path where GitLab Runner is installed:
./gitlab-runner.exe register

Enter your GitLab instance URL:
Please enter the gitlab-ci coordinator URL (e.g. https://gitlab.com)
https://gitlab.com

Enter the token you obtained to register the Runner:
Please enter the gitlab-ci token that we copied earlier

Enter a description for the Runner, you can change this later in GitLab’s UI:
Please enter the gitlab-ci description for this Runner
[hostname] my-runner

Enter the tags associated with the Runner, you can change this later in GitLab’s UI:
Please enter the gitlab-ci tags for this Runner (comma separated):
docker

Enter the Runner executor:
Please enter the executor: ssh, docker+machine, docker-ssh+machine, kubernetes, docker, parallels, virtualbox, docker-ssh, shell:
docker

If you chose Docker as your executor, you’ll be asked for the default image to be used for projects that do not define one in .gitlab-ci.yml:
Please enter the Docker image (eg. ruby:2.6):
python:2.7

Once the Runner is created successfully, it will be displayed under Settings > CI/CD > Runner > Specific Runner section.
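The same registration can also be done non-interactively, which is handy when scripting Runner setup (a sketch; the registration token is a placeholder and the other values mirror the answers given above):

./gitlab-runner.exe register --non-interactive --url "https://gitlab.com/" --registration-token "<your-registration-token>" --description "my-runner" --tag-list "docker" --executor "docker" --docker-image "python:2.7"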

4. Now, go to CI/CD > Pipelines and click the Run Pipeline button.

It will open a new window. Click the Run Pipeline button again.

5. Once the job completes successfully, it will be displayed in the pipeline view.

Welcome to Cloud View!

This week we look at some of the infrastructure and operations trends and insights on MLOps, Amazon Alexa and Blue green deployment on Azure.

Industry Perspective & News

AWS slashes cost of DR

Celebrate! AWS has announced massive cloud cost reductions on disaster recovery and Kubernetes services.

Infrastructure & Operations Trends

Another week, another bunch of predictions. The difference here is that the article also details some trends that are on the wane.  


Tech Insights

MongoDB Supports GraphQL

MongoDB is now supporting GraphQL language for accessing its serverless application platform. This promises to extend their technology towards web and mobile apps.


CircleCI Orbs

In just a year since its launch, CircleCI Orbs are being used by over 13,000 organizations in close to 9 million CI/CD pipelines. New collaborations with 20 partners now extend the Orbs ecosystem even further.


MLOps

MLOps – the collaborative best practices that accelerate the machine lifecycle across model development, deployment, monitoring, and more – can give organizations a massive edge against competitors. John ‘JG’ Chirapurath, General Manager, Azure Data & AI explains more in this blog.


From CloudIQ

Amazon Alexa Custom Skills – How to Build One Step-by-Step

Amazon’s Alexa is the voice-activated and interactive AI bot designed to respond to a number of commands and converse with people. Alexa Skills are apps that give Alexa even more abilities.

Blue Green Deployment on Azure, Safe Strategy with Zero Downtime

In Azure, different processes are available for implementing the Blue-Green strategy with two environments. In this article we discuss some of these techniques.

Cybersecurity is the number one concern for CEOs and is unanimously seen as the biggest threat in the coming years. Reports suggest that the damages from cyberattacks will amount to $6 trillion annually by 2021.

While a lot of news coverage is given to malicious hackers and ransomware attacks, another crucial area of cyber protection is tightening the internal defenses with intelligent identity management. Keeping a tight control on who can get past your firewalls is vital for maintaining optimum security.

In this article we will review the comprehensive set of security tools available in Azure Cloud.

Azure Active Directory

Multi-Factor Authentication

Azure Multi-Factor Authentication (MFA) helps safeguard access to data and applications while maintaining simplicity for users. It provides additional security by requiring a second form of authentication and delivers strong authentication via a range of easy to use authentication methods. Users may or may not be challenged for MFA based on configuration decisions that an administrator makes.

The security of two-step verification lies in its layered approach. Compromising multiple authentication factors presents a significant challenge for attackers. Even if an attacker manages to learn the user’s password, it is useless without also having possession of the additional authentication method. It works by requiring two or more of the following authentication methods: Something you know (typically a password), Something you have (a trusted device that is not easily duplicated, like a phone), Something you are (biometrics)

Conditional Access policies

Conditional Access is the tool used by Azure Active Directory to bring signals together, to make decisions, and enforce organizational policies. Conditional Access policies at their simplest are if-then statements; if a user wants to access a resource, then they must complete an action. Example: A payroll manager wants to access the payroll application and is required to perform multi-factor authentication to access it.

Azure AD identity protection

Identity Protection is a tool that allows organizations to accomplish three key tasks: Automate the detection and remediation of identity-based risks, Investigate risks using data in the portal, Export risk detection data to third-party utilities for further analysis. The signals generated by and fed to Identity Protection can be further fed into tools like Conditional Access to make access decisions, or fed back to a security information and event management (SIEM) tool for further investigation based on your organization’s enforced policies.

Azure AD Privileged Identity Management

Privileged Identity Management provides time-based and approval-based role activation to mitigate the risks of excessive, unnecessary, or misused access permissions on resources that you care about. Here are some of the key features of Privileged Identity Management:

  • Provide just-in-time privileged access to Azure AD and Azure resources
  • Assign time-bound access to resources using start and end dates
  • Require approval to activate privileged roles
  • Enforce multi-factor authentication to activate any role
  • Use justification to understand why users activate
  • Get notifications when privileged roles are activated
  • Conduct access reviews to ensure users still need roles
  • Download audit history for internal or external audit

Network Security

Network Security Groups (NSGs)

Network security group security rules are evaluated by priority using the 5-tuple information (source, source port, destination, destination port, and protocol) to allow or deny the traffic. A flow record is created for existing connections. Communication is allowed or denied based on the connection state of the flow record. The flow record allows a network security group to be stateful. If you specify an outbound security rule to any address over port 80, for example, it’s not necessary to specify an inbound security rule for the response to the outbound traffic. You only need to specify an inbound security rule if communication is initiated externally. The opposite is also true. If inbound traffic is allowed over a port, it’s not necessary to specify an outbound security rule to respond to traffic over the port. Existing connections may not be interrupted when you remove a security rule that enabled the flow. Traffic flows are interrupted when connections are stopped, and no traffic is flowing in either direction, for at least a few minutes.
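As a rough sketch of what such a rule looks like in practice, the Azure CLI command below creates an inbound allow rule for port 80 (the resource group, NSG, and rule names are placeholders):

# Allow inbound HTTP to whatever the NSG is attached to
az network nsg rule create \
  --resource-group myResourceGroup \
  --nsg-name myNsg \
  --name allow-http-inbound \
  --priority 100 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --source-address-prefixes '*' \
  --source-port-ranges '*' \
  --destination-address-prefixes '*' \
  --destination-port-ranges 80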

Azure Firewall

With Azure Firewall, you can configure – Application rules that define fully qualified domain names (FQDNs) that can be accessed from a subnet and Network rules that define source address, protocol, destination port, and destination address. Network traffic is subjected to the configured firewall rules when you route your network traffic to the firewall as the subnet default gateway.
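For illustration, an application rule permitting outbound HTTPS to a specific FQDN could be created roughly as follows (this assumes the azure-firewall CLI extension is installed; all names and addresses are placeholders):

az extension add --name azure-firewall
az network firewall application-rule create \
  --resource-group myResourceGroup \
  --firewall-name myFirewall \
  --collection-name app-rules \
  --name allow-github \
  --priority 200 \
  --action Allow \
  --protocols Https=443 \
  --source-addresses 10.0.1.0/24 \
  --target-fqdns github.com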

Application security groups

Application security groups enable you to configure network security as a natural extension of an application’s structure, allowing you to group virtual machines and define network security policies based on those groups. You can reuse your security policy at scale without manual maintenance of explicit IP addresses. The platform handles the complexity of explicit IP addresses and multiple rule sets, allowing you to focus on your business logic.
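A minimal sketch with the Azure CLI: create an application security group and reference it in an NSG rule instead of explicit IP addresses (names are placeholders; the NICs of the relevant VMs would also need to be associated with the group):

az network asg create --resource-group myResourceGroup --name web-servers
az network nsg rule create \
  --resource-group myResourceGroup \
  --nsg-name myNsg \
  --name allow-http-to-web \
  --priority 110 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --source-address-prefixes '*' \
  --destination-asgs web-servers \
  --destination-port-ranges 80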

Resource management security

Azure resource locks

As an administrator, you may need to lock a subscription, resource group, or resource to prevent other users in your organization from accidentally deleting or modifying critical resources. You can set the lock level to CanNotDelete or ReadOnly. In the portal, the locks are called Delete and Read-only, respectively. CanNotDelete means authorized users can still read and modify a resource, but they can’t delete the resource. ReadOnly means authorized users can read a resource, but they can’t delete or update the resource. Applying this lock is similar to restricting all authorized users to the permissions granted by the Reader role.
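For example, a delete lock on a resource group can be applied with a single Azure CLI command (the names are placeholders):

# Prevent accidental deletion of everything in the resource group
az lock create \
  --name no-delete \
  --lock-type CanNotDelete \
  --resource-group myResourceGroup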

Azure policies

Azure Policy is a service in Azure that you use to create, assign, and manage policies. These policies enforce different rules and effects over your resources, so those resources stay compliant with your corporate standards and service level agreements. Azure Policy meets this need by evaluating your resources for non-compliance with assigned policies. All data stored by Azure Policy is encrypted at rest. For example, you can have a policy to allow only a certain SKU size of virtual machines in your environment. Once this policy is implemented, new and existing resources are evaluated for compliance.
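As a sketch, the SKU restriction mentioned above could be assigned with the Azure CLI roughly as follows (the built-in definition is looked up by display name, which may vary slightly between environments; the allowed SKU list is illustrative):

# Look up the built-in "allowed VM size SKUs" policy definition
def=$(az policy definition list \
  --query "[?contains(displayName, 'Allowed virtual machine size SKUs')].name" -o tsv)
# Assign it to a resource group, allowing only one SKU
az policy assignment create \
  --name restrict-vm-skus \
  --policy "$def" \
  --resource-group myResourceGroup \
  --params '{"listOfAllowedSKUs": {"value": ["Standard_D2s_v3"]}}'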

Custom RBAC roles

Granting permission using custom Azure AD roles is a two-step process that involves creating a custom role definition and then assigning it using a role assignment. A custom role definition is a collection of permissions that you add from a preset list. These permissions are the same permissions used in the built-in roles.

Once you’ve created your role definition, you can assign it to a user by creating a role assignment. A role assignment grants the user the permissions in a role definition at a specified scope. This two-step process allows you to create a single role definition and assign it many times at different scopes. A scope defines the set of Azure AD resources the role member has access to.
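As a sketch of the two steps for an Azure resource (RBAC) custom role using the Azure CLI; the role name, actions, scopes, and assignee are illustrative:

# Step 1: create the custom role definition from a JSON description
cat > vm-operator.json <<'EOF'
{
  "Name": "Virtual Machine Operator",
  "Description": "Can start and restart virtual machines.",
  "Actions": [
    "Microsoft.Compute/virtualMachines/start/action",
    "Microsoft.Compute/virtualMachines/restart/action"
  ],
  "AssignableScopes": ["/subscriptions/<subscription-id>"]
}
EOF
az role definition create --role-definition @vm-operator.json

# Step 2: assign the role to a user at resource-group scope
az role assignment create \
  --assignee user@example.com \
  --role "Virtual Machine Operator" \
  --resource-group myResourceGroup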

Encryption for data at rest

Azure SQL Database Always Encrypted

Always Encrypted is a new data encryption technology in Azure SQL Database and SQL Server that helps protect sensitive data at rest on the server, during movement between client and server, and while the data is in use, ensuring that sensitive data never appears as plaintext inside the database system. After you encrypt data, only client applications or app servers that have access to the keys can access plaintext data.

Implement database encryption

Transparent data encryption (TDE) helps protect Azure SQL Database, Azure SQL Managed Instance, and Azure Data Warehouse against the threat of malicious offline activity by encrypting data at rest. It performs real-time encryption and decryption of the database, associated backups, and transaction log files at rest without requiring changes to the application. By default, TDE is enabled for all newly deployed Azure SQL databases.
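Although TDE is on by default for new databases, its state can be checked or set per database with the Azure CLI, for example (server and database names are placeholders):

az sql db tde show --resource-group myResourceGroup --server my-sql-server --database mydb
az sql db tde set --resource-group myResourceGroup --server my-sql-server --database mydb --status Enabled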

Implement Storage Service Encryption

Data in Azure Storage is encrypted and decrypted transparently using 256-bit AES encryption, one of the strongest block ciphers available, and is FIPS 140-2 compliant. Azure Storage encryption is similar to BitLocker encryption on Windows.

Azure Storage encryption is enabled for all new storage accounts, including both Resource Manager and classic storage accounts. Azure Storage encryption cannot be disabled. Because your data is secured by default, you don’t need to modify your code or applications to take advantage of Azure Storage encryption.

Implement disk encryption

Azure Disk Encryption helps protect and safeguard your data to meet your organizational security and compliance commitments. It uses the Bitlocker feature of Windows to provide volume encryption for the OS and data disks of Azure virtual machines (VMs), and is integrated with Azure Key Vault to help you control and manage the disk encryption keys and secrets.
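A minimal sketch of enabling Azure Disk Encryption on an existing Windows VM with the Azure CLI (the VM and Key Vault names are placeholders, and the Key Vault must already be enabled for disk encryption):

az vm encryption enable \
  --resource-group myResourceGroup \
  --name myWindowsVM \
  --disk-encryption-keyvault myKeyVault \
  --volume-type All
# Check the encryption status afterwards
az vm encryption show --resource-group myResourceGroup --name myWindowsVM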

Configure application security

Configure SSL/TLS certs

If you purchase an App Service Certificate from Azure, Azure manages the following tasks: Takes care of the purchase process from GoDaddy, Performs domain verification of the certificate, Maintains the certificate in Azure Key Vault, Manages certificate renewal (see Renew certificate), and Synchronizes the certificate automatically with the imported copies in App Service apps.

Configure and Manage Key Vault

Manage access to Key Vault

Azure Key Vault is a cloud service that safeguards encryption keys and secrets like certificates, connection strings, and passwords. Because this data is sensitive and business-critical, you need to secure access to your key vaults by allowing only authorized applications and users.

Access to a key vault is controlled through two interfaces: the management plane and the data plane. The management plane is where you manage Key Vault itself. Operations in this plane include creating and deleting key vaults, retrieving Key Vault properties, and updating access policies. The data plane is where you work with the data stored in a key vault. You can add, delete, and modify keys, secrets, and certificates.

To access a key vault in either plane, all callers (users or applications) must have proper authentication and authorization. Authentication establishes the identity of the caller. Authorization determines which operations the caller can execute.

Both planes use Azure Active Directory (Azure AD) for authentication. For authorization, the management plane uses role-based access control (RBAC), and the data plane uses a Key Vault access policy.
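As a sketch of the two planes with the Azure CLI: an RBAC role assignment grants management-plane rights on the vault, while an access policy grants data-plane rights (user, vault, and subscription values are placeholders):

# Management plane: let a user manage the vault itself
az role assignment create \
  --assignee user@example.com \
  --role "Key Vault Contributor" \
  --scope /subscriptions/<subscription-id>/resourceGroups/myResourceGroup/providers/Microsoft.KeyVault/vaults/myKeyVault

# Data plane: let the same user read secrets via an access policy
az keyvault set-policy \
  --name myKeyVault \
  --upn user@example.com \
  --secret-permissions get list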

Welcome to Cloud View!

This week we look at the worldwide IT spending for 2020, aligning IoT and AI with business goals and the Kubernetes bug bounty program.

Tech Insights

IT Spending 2020

Gartner starts the year with some great news – Worldwide IT spending is projected to increase by 3.4% to $3.9 trillion in 2020. Get more insights from the Gartner report here.

Adoption of IoT and AI

To get the full benefits of IoT and AI, their adoption will need to merge with business strategies and goals. And that will change the way businesses are structured.


Industry Insights

Kubernetes-based Red Hat OpenShift 4.3 

Red Hat’s cloud-native commitment stays strong! The company just released its Kubernetes-based Red Hat OpenShift 4.3 and Red Hat OpenShift Container Storage 4 to provide multi-cloud Kubernetes container support.


Kubernetes bug bounty program

Kubernetes Product security committee is launching a new bug bounty program to tap into the power of the highly active Kubernetes community to find vulnerabilities in the software. Find out how you can get started.


From CloudIQ

Introduction to Machine Learning and How It Works

In this article we look at the basics of machine learning, the different algorithm models and a simple machine learning algorithm example using Python.

How to build real-time streaming data pipelines and applications using Apache Kafka?

In this article we will discuss how to use Apache Kafka, the distributed publish-subscribe messaging system and to pass messages from one end-point to another.

Welcome to Cloud View!

This week we look at some of the technologies of the future and insights on container performance & security, connected vehicles, and chatbots.

Latest in Tech

CES 2020

2020 starts with the biggest tech event of the year – CES 2020. The Las Vegas event showcases the technology of the future! Here are the highlights of some of the enterprise technologies on display.

Technology 2020

Want to know the latest technology trends that will impact businesses in 2020? From the empowered edge to human augmentation and more, here are 15 of them.


Industry Insights

Service Mesh

Containers are dominating the software world, but despite their popularity and orchestration software like Kubernetes, they are still challenging to manage. Service meshes come as the answer to improving container performance and security.


Connected Vehicles

The world of vehicle software is heating up! BlackBerry QNX and AWS are targeting automotive OEMs to bring services, personalization, health monitoring, and advanced driver assistance (ADAS) to vehicles.


Chatbots

In the next couple of years, 70% of white-collar workers will chat with conversation AI platforms daily – predicts Gartner. Here is a case study of improving customer service with an intelligent virtual assistant using IBM Watson.


From CloudIQ

Azure Database for MySQL and Grafana to monitor Azure services

More and more organizations run their business-critical applications on containers using Azure and that calls for a more intuitive dashboard to monitor and track Azure Services.

How to Create and Run Spark Clusters with Qubole using AWS

Qubole is a platform that puts big data on the cloud to power business decisions based on real-time analytics. Here is how you can create and run Spark Clusters with Qubole using AWS.

As more and more organizations run their business-critical applications on containers using Azure, there are new challenges in monitoring and managing them. Of course, there is the Azure dashboard, but with elaborate set-ups and such, IT teams feel the need for a more intuitive dashboard to monitor and track Azure services.

The answer, Grafana.

Grafana is an open-source dashboard and graph editor for Graphite, Elasticsearch, OpenTSDB, Prometheus, and InfluxDB. It is a powerful visualization application that deals effectively with large-scale measurement data and time-series data.

As compared to other dashboards, especially the native Azure dashboard, Grafana offers a wider variety of visualization options (graphs, heatmaps, tables, and more) and can collect and collate data from multiple sources. It is designed for evaluating metrics such as system CPU, memory, disk, and I/O utilization.

A Grafana dashboard will help you understand, analyze, monitor, and explore your data with flexible and fast visualization tools.

In this article we will look at using Azure Database for MySQL and Grafana to monitor Azure services

Access Requirements

In your Azure subscription, your account must have “Microsoft.Authorization/*/Write” access to assign an AD app to a role. This access is granted through the “Owner” or “User Access Administrator” roles; the “Contributor” role does not have the required permissions.

Virtual machine requirements:
  • VM operating system: Linux (Ubuntu 18.04)
  • VM size: Standard D2s v3 (2 vCPUs, 8 GiB memory) is more than enough
  • SSH access: username and password
  • Default port: 3000
  • NSG rule: open an inbound rule in the network security group with limited access to ports 3000 and 22 (SSH)
  • Assign a static public IP address to the VM

MySQL Creation and Linking to Grafana

1. Create an Azure Database for MySQL server from the Azure portal.

2. Select the resource group, provide Server name, admin username, password, confirm password. Take a note of the password; it is used several times throughout the set-up.

3. To select compute and storage,

a. There are three pricing tiers (choose Basic):

  • Basic
  • General-purpose
  • Memory-optimized

b. Select the appropriate sizes.

  • Compute generation: Gen 5
  • vCore: 1
  • storage: 5GB
  • Auto-growth:
  • Backup retention period:
  • Local redundant / Geo-Redundant

For Basic compute and storage, the maximum vCore count is 2 and the maximum storage is 1024 GB. Choose as per your needs.

4. Then click Review+Create

5. Once the Azure Database for MySQL server is deployed, go to Connection security and make the following changes:

  • Add a client IP.
  • Set “Allow access to Azure services” to ON
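If you prefer the CLI to the portal, steps 1-5 above can be approximated with Azure CLI commands like these (the names, location, and password are placeholders; 5120 MB corresponds to the 5 GB of storage chosen earlier):

# Create a Basic-tier Azure Database for MySQL server (Gen 5, 1 vCore)
az mysql server create \
  --resource-group myResourceGroup \
  --name my-grafana-mysql \
  --location eastus \
  --admin-user grafanaadmin \
  --admin-password '<strong-password>' \
  --sku-name B_Gen5_1 \
  --storage-size 5120
# The 0.0.0.0 rule is Azure's convention for "Allow access to Azure services"
az mysql server firewall-rule create \
  --resource-group myResourceGroup \
  --server my-grafana-mysql \
  --name AllowAzureServices \
  --start-ip-address 0.0.0.0 \
  --end-ip-address 0.0.0.0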

6. Connect to the MySQL server in MySQL Workbench (or any MySQL client) using the server admin login name and password, and open a new query tab.

7. Run the following command in the query tab:

  • CREATE DATABASE grafana;

8. The MySQL server-side configuration is now complete. Next, we need to provide these server details to the Docker container that will run Grafana.

9. Log in to the VM that will run Grafana using the appropriate SSH credentials (password or access keys).

10. Note the following values and save them as environment variables in an environment list (env.list).

Type     = mysql
Host     = <servername>:3306 (the MySQL server name created earlier)
Name     = grafana (the database created and given access in the earlier steps)
User     = <server admin login name>
Password = <server login password>

11. Save the values above as a list (env.list), as shown below.
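Grafana reads its database settings from GF_DATABASE_* environment variables, so the env.list passed to the container could look roughly like this (the server name and credentials are placeholders):

GF_DATABASE_TYPE=mysql
GF_DATABASE_HOST=my-grafana-mysql.mysql.database.azure.com:3306
GF_DATABASE_NAME=grafana
GF_DATABASE_USER=grafanaadmin@my-grafana-mysql
GF_DATABASE_PASSWORD=<server-login-password>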

Installing Grafana as a Docker container along with its required plugins

1. Log in to the server using appropriate credentials.

2. Get updates using, sudo apt-get update

3. Install docker using the command, sudo apt install docker.io

4. Enable and start docker,

  • sudo systemctl start docker
  • sudo systemctl enable docker

5. Verify the installation using the command,

  • docker --version (the result will show the installed Docker version)

6. Now login as root using the command,

  • sudo su

7. Pull the Grafana image; this needs an Internet connection as it will download the image from Docker’s public hub

  • docker pull grafana/grafana

8. Run the image with saved environment variables,

  • docker run -d --name=grafana -p 3000:3000 --env-file ./env.list grafana/grafana

9. Verify the container is running using the “docker ps” command

10. The next step is to install the plugins for Grafana, which will be used in setting up the dashboard. We need to login to the container created previously to install these plugins.

11. Now create a shell inside the container using,

  • docker exec -it grafana /bin/bash

12. You now have a shell inside the container.

13. By default, the Grafana dashboard ships with a limited number of panel plugins; to use more visualizations, we can install plugins manually. Copy the plugin installation commands listed below and run them one by one or all at once.

  • grafana-cli plugins install michaeldmoore-annunciator-panel
  • grafana-cli plugins install grafana-piechart-panel
  • grafana-cli plugins install farski-blendstat-panel
  • grafana-cli plugins install michaeldmoore-multistat-panel
  • grafana-cli plugins install grafana-polystat-panel
  • grafana-cli plugins install flant-statusmap-panel
  • grafana-cli plugins install grafana-clock-panel
  • grafana-cli plugins install neocat-cal-heatmap-panel
  • grafana-cli plugins install briangann-gauge-panel
  • grafana-cli plugins install natel-plotly-panel

14. Once the plugins are installed:

  • Verify the installation by going into the /var/lib/grafana/plugins directory using the commands below
  • cd /var/lib/grafana/plugins
  • To view the installed plugins, use the ls command

15. Now exit the container, command: exit

16. Now restart the container using,

  • docker container restart grafana (here “grafana” refers to the container name created earlier, which can be found using the docker ps command)

Linking Azure Monitoring Tools

Service Principal

  • Register an app in Azure Active Directory (AD).
  • Create a client secret in the Registered app.
  • Go to the subscription > Access control (IAM) > search for the app registration and provide “Reader” access to the app registered in Azure AD in the first step (see the sketch after this list).
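These three steps can also be collapsed into a single Azure CLI command, which creates the app registration, a client secret, and the Reader role assignment in one go (the name and subscription ID are placeholders):

az ad sp create-for-rbac \
  --name grafana-monitoring-reader \
  --role Reader \
  --scopes /subscriptions/<subscription-id>
# The output contains the appId (client ID), password (client secret), and tenant ID
# that the Grafana data source configuration asks for below.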

Applying the service principal to Grafana,

  • Go to Grafana UI by using public IP address followed by port number,
    i.e., (IP address):3000 example: 13.25.49.164:3000
  • Now add the Azure Monitor data source and click Select.
  • In the data source settings, input the tenant ID, client ID, and client secret.
  • Then provide the details for the Log Analytics workspace and Application Insights.
  • If the provided details are correct, a success message is displayed.

Welcome to Cloud View!

We hope you had a joyful and fun holiday! NOW with the festivities behind us, it’s time for business again!

Let’s start with what IT Leaders are planning for 2020.

Leader’s Speak

CIO’s plan for 2020

What are the CIOs thinking and planning for the coming year? It seems finding talent, dealing with rising security problems, and prioritizing the acquisition of new technologies are some of the topics occupying the C-suite.

IT Industry in 2020

Joel Friedman, CTO, Rackspace offers his take on what 2020 will hold for the IT industry. According to Joel, hybrid and multi-cloud, SaaS, and security problems will grow.

A decade since the launch of Azure

2020 marks a decade since the launch of Azure. Ever wondered what the founders think about their creation? Here’s a short interview with Microsoft’s Yousef Khalidi and Hoi Vo, key members of the original Azure ‘dream team’.


Industry insights

AI Solutions

As a digital-first service provider, we at CloudIQ help implement AI solutions across organizations. Impact like this is what we aim for! This is what the future of AI looks like.


Top Big Data companies to watch for in 2020

Big Data is going to dominate many a boardroom in the coming few years. We start the year by tracking some promising companies that will define the coming year with their next-generation data management, data science, and machine learning technology.


From CloudIQ

Deploying a Pod containing three applications using Jenkins CI/CD pipeline and updating them selectively

Kubernetes pod is a layer of abstraction wrapped around containers to group them together for resource allocation and efficient management. Here is how to deploy a pod containing three applications using Jenkins CI/CD pipeline and update them selectively.

Provisioning Cloud Infrastructure using AWS CloudFormation Templates

Spend less time managing cloud infrastructure and focus on building your application, thanks to AWS CloudFormation templates. Here is a quick start guide to creating the templates for provisioning cloud infrastructure.

A Kubernetes pod – incidentally, some say it is named after a whale pod because the docker logo is a whale – is the foundational unit of execution in a K8s ecosystem. While docker is the most common container runtime, pods are container agnostic and support other containers as well.

Simply put, a K8s pod is a layer of abstraction wrapped around containers to group them together to allocate resources and to manage them efficiently.

Continuous integration and delivery or CI/CD is a critical part of DevOps ecosystems, especially for cloud-native applications. DevOps teams frequently use Jenkins CI/CD pipelines to increase the speed and quality of collaborated software development ecosystems by adding automation. Thanks to Helm, deploying Jenkins server to K8s is quick and easy. The difficult bit is building the pipeline.

Here is a post that describes how to deploy a pod containing three applications using a Jenkins CI/CD pipeline and update them selectively.

Task on Hand:

Use a Jenkins pipeline to build a spring-boot application to generate a jar file, dockerize the application to create a docker image, push the image to a docker repository, and pull the image into the pod to create containers. The pod should contain 3 containers for the three applications, respectively. Upon git commit, only the container for which there is a change must be updated (rolling update).

Steps
  1. Create a pipeline using a groovy script to clone the respective git repo, build the project using maven, build the docker images, push them to Docker Hub, and pull these images to run containers in the pod.
  2. Repeat the steps for all the three applications in separate stages. Make sure to create a separate directory in each stage to prevent conflicts when using similar files. Also, this clones the different git repos into different folders to avoid confusion.
  3. Here is the Jenkinsfile/Pipeline script to perform the above task:
pipeline {
agent any
stages {
stage('Build1'){
    steps{
        dir('app1'){
            script{
                git 'https://github.com/cloud/simple-spring.git'
                sh 'mvn clean install'
                app = docker.build("cloud007/simple-spring")
                docker.withRegistry( "https://registry.hub.docker.com", "dockerhub" ) {
                // dockerImage.push()
                app.push("latest")
            }
        }
    }
}

}
stage('Build2'){
    steps{
        dir('app2'){
            script{
                git 'https://github.com/cloud/simple-spring-2.git'
                sh 'mvn clean install'
                app = docker.build("cloud007/simple-spring-2")
                docker.withRegistry( "https://registry.hub.docker.com", "dockerhub" ) {
                // dockerImage.push()
                app.push("latest")
            }
        }
    }
}

}
stage('Build3'){
    steps{
        dir('app3'){
            script{
                git 'https://github.com/cloud/simple-spring-3.git'
                sh 'mvn clean install'
                app = docker.build("cloud007/simple-spring-3")
                docker.withRegistry( "https://registry.hub.docker.com", "dockerhub" ) {
                // dockerImage.push()
                app.push("latest")
            }
        }
    }
}
}
stage('Orchestrate')
{
    steps{
        script{
    sh 'kubectl apply -f demo.yaml'
        }
    }
}

}
}

4. Make sure Docker is properly configured: expose dockerd on port 4243, and change the permissions on /var/run/docker.sock so that Jenkins can run docker commands, as shown below.
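A sketch of that host-side preparation on a Linux Jenkins host (assuming Jenkins runs as the jenkins user; dockerd is exposed on TCP by starting it with an extra -H tcp://0.0.0.0:4243 listener):

# Let the jenkins user run docker commands
sudo usermod -aG docker jenkins
# The socket ownership resets after a restart, so re-apply if needed
sudo chown root:docker /var/run/docker.sock
sudo systemctl restart docker jenkins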

5. Coming to integrating Kubernetes with Jenkins, it can be done using two plugins:

  • Kubernetes plugin: When using this plugin, we configure the credentials to use our local cluster/Azure cluster and specify the container templates for the containers to be created in the pipeline. But since all the tasks must run in containers, it is a slightly confusing approach; a better option is the kubernetes-cli plugin.

Refer: https://github.com/jenkinsci/kubernetes-plugin

  • Kubernetes-cli plugin: It provides a withKubeConfig() pipeline step, which uses the configured credentials to connect to our cluster and perform kubectl commands. However, when running the pipeline, the kubeconfig wasn’t recognized for some reason and kept giving a ‘file not found’ error.

Refer: https://github.com/jenkinsci/kubernetes-cli-plugin/blob/master/README.md

Hence, we installed kubectl on the Jenkins host, configured the cluster manually, and ran shell commands from the Jenkins pipeline, where Jenkins was recognized as an anonymous user and was only granted get access but couldn’t create pods/deployments.

Here are some common problems faced during this process and the troubleshooting procedure.

  • Configuring Jenkins to use local minikube cluster:
    We had trouble using both the plugins to properly configure Jenkins to create deployments as required. Using shell commands to run kubectl was also not successful since Jenkins was recognized as an anonymous user, and authorization prevented anonymous users from creating deployments.
  • Permission for /var/run/docker.sock is reset to root after every restart, so make sure to change it to allow Jenkins to continue to use docker commands: chown jenkins /var/run/docker.sock
  • Installing Minikube:
    i) Started the minikube cluster using Hyper-V as the driver and created a virtual switch:
    minikube start --vm-driver=hyperv --hyperv-virtual-switch="Primary Virtual Switch"

    ii) Installation takes a lot of time, so we have to wait patiently, and eventually, the cluster will get configured and started. If there is a problem with apiserver, then stop the machine after SSHing into minikube vm:
    minikube ssh
    sudo poweroff

    iii) Then start minikube the same way.

Here are some suggested best practices

Maintaining the git repo:
  • Branching must be used while updating the source code or adding a particular file to the repository. Suppose you want to add a readme, then create a new branch from master, create the readme and commit it and then merge the branch with the origin.
  • Similarly, for adding some test files/changing source code – create a new branch for testing/modification, update and commit the code and merge with the master when finished. The purpose of this is to allow easy roll back to the original master if you run into some errors when working with the new files and to prevent any conflict in code with the master.
  • Commit only after you have tested the code properly, never commit incomplete code.
  • Write good commit messages to keep track of the changes you have made.
Versioning:
  • Follow the versioning convention X.Y.Z where X is incremented for a new major update/feature, Y is incremented for minor updates/minor features and Z for minor patches/bug fixes
  • Avoid version lock that has too many dependencies in a single version. In such a scenario the package can only be updated after releasing new versions for every dependent package.
Docker repo:
  • Use unique tags for deployment/pushing images to the repository.
  • Always use stable tags for building images but never deploy/pull images using stable tags. Stable tags are the ones that do not roll over the updates to future versions but are bound to the current version (tag). 

Welcome to Cloud View!

The last two days of 2019 feel a bit like a waiting period – it’s a tad early to start celebrating, but it’s hard to plan anything until the new year celebrations are truly behind us and we are back at work. We think a good way to use this time would be to indulge in a bit of nostalgia and look back at how the technology landscape evolved in 2019.

Recap 2019

Kubernetes Podcast in 2019

If you deal with Kubernetes, then we are sure you follow the Kubernetes Podcast. Here is the roundup of the year’s best! Enjoy!

CRN’s 2019 Year in Review

LOVE ‘Top-10’ listicles? Here is a mammoth list of technology top 10s by CRN. From top 10 cybersecurity stories to the top 10 mobile apps of 2019 – it’s all here!

Top 10 Smarter with Gartner Articles for 2019

No report, survey, or listicle is complete without Gartner. Here are the 10 best “Smarter with Gartner” articles from 2019.


Industry insights

What’s New with AWS

A quick video to recap the latest AWS updates and announcements (there are many!) and a web link, too, in case you want to explore categories in more detail.

IBM Z Open Editor Support for LSP

Any programmer can attest to the power – and the usefulness – of Language Server Protocol (LSP). Now its integration with IBM’s Z Open Editor opens a whole new level for coders across the globe.


From CloudIQ

Creating AWS Security Groups for Kubernetes

We are going to discuss creating security groups in AWS for Kubernetes. The goal is to set up a Kubernetes cluster on AWS EC2, having provisioned your virtual machines.

Deploy a Spring-Boot Application in Kubernetes Pod using Jenkins CI/CD Pipeline

Kubernetes has become the preferred platform of choice for container orchestration. Here is a walkthrough of how Jenkins CI/CD pipeline is used to deploy a spring boot application in K8s.

Signing off now and will see you all in the new year. Party hard and bring in 2020 in style.

Kubernetes has become the preferred platform for container orchestration, delivering maximum operational efficiency. To understand how K8s works, one must understand its most basic execution unit – a pod.

Kubernetes doesn’t run containers directly, rather through a higher-level structure called a pod. A pod has an application’s container (or, in some cases, multiple containers), storage resources, a unique network IP, and options that govern how the container(s) should run.

Pods can hold multiple containers or just one. Every container in a pod will share the same resources and network. A pod is used as a replication unit in Kubernetes; hence, it is advisable to not add too many containers in one pod. Future scaling up would lead to unnecessary and expensive duplication.

To maximize the ease and speed of Kubernetes, DevOps teams like to add automation using Jenkins CI/CD pipelines. Not only does this make the entire process of building, testing, and deploying software go faster, but it also minimizes human error. Here is how Jenkins CI/CD pipeline is used to deploy a spring boot application in K8s.

TASK on Hand:

Create a Jenkins pipeline to dockerize a spring application, build a docker image, push it to the Docker Hub repo, and then pull the image into an AKS cluster to run it in a pod.

Complete repository:

All the files required for this task are available in this repository:
https://github.com/saiachyuth5/simple-spring

Pre-Requisites:

A spring-boot application and a Dockerfile to containerize it.

STEPS:

1. Install Jenkins:

2. Connect host docker daemon to Jenkins:

  • Run the command chown -R jenkins:docker <file/folder> to allow Jenkins to access Docker.
  • Go to Manage Jenkins > Configure System in the browser and scroll to the bottom.
  • Click the ‘add cloud’ dropdown and add Docker. Add the docker host URI in the format tcp://hostip:4243
  • Click Verify Connection to check your connection. If everything was done right, the Docker version is displayed.

3. Adding global credentials:

  • Go to Credentials on the Jenkins dashboard, click global credentials, and then Add credentials.
  • Select the kind as Microsoft Azure Service Principal and enter the required IDs. Similarly, save the Docker credentials under the kind ‘Username with password’.

4. Create the Jenkinsfile :

  • Refer to the official Jenkins documentation for the pipeline syntax, usage of Jenkinsfile, and simple examples.
  • Below is the Jenkinsfile used for this task.

Jenkinsfile:

NOTE: While this example logs in to Azure using the actual IDs, it is recommended to use stored credentials instead of passing the exact parameters.

pipeline {
environment {
registryCredential = "docker"
}
agent any
stages {
stage('Build') {
    steps{
    script {
        sh 'mvn clean install'
    }
    }
}
stage('Load') {
    steps{
    script {
        app = docker.build("cloud007/simple-spring")
    }
    }
}
    stage('Deploy') {
    steps{
    script {
        docker.withRegistry( "https://registry.hub.docker.com", registryCredential ) {
        // dockerImage.push()
        app.push("latest")
        }
    }
    }
}
stage('Deploy to ACS'){
    steps{
        withCredentials([azureServicePrincipal('dbb6d63b-41ab-4e71-b9ed-32b3be06eeb8')]) {
        sh 'echo "logging in" '
        sh 'az login --service-principal -u **************************** -p ********************************* -t **********************************'
        sh 'az account set -s ****************************'
        sh 'az aks get-credentials --resource-group ilink --name    mycluster'
        sh 'kubectl apply -f sample.yaml'
    }
}
}
}
}

5. Create the Jenkins project:

  • Select New view > Pipeline and click OK.
  • Scroll to the bottom and select the definition as Pipeline from SCM.
  • Select the SCM as Git, enter the git repo to be used, and give the path to the Jenkinsfile in Script Path.
  • Click apply, and the Jenkins project has now been created.
  • Go to my views, select your view, and click on build to build your project.

6. Create and connect to Azure Kubernetes cluster:

  • Create an Azure Kubernetes cluster with 1-3 nodes and add its credentials to global credentials in Jenkins.
  • Install azure cli on the Jenkins host machine.
  • Use shell commands in the pipeline to log in, get credentials, and then create a pod using the required yaml file (see the sketch below).
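A minimal sketch of creating such a cluster and fetching its credentials with the Azure CLI; the resource group and cluster names mirror the ones used in the pipeline above:

az aks create \
  --resource-group ilink \
  --name mycluster \
  --node-count 1 \
  --generate-ssh-keys
az aks get-credentials --resource-group ilink --name mycluster
kubectl apply -f sample.yaml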

YAML used:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spring-helloworld
  template:
    metadata:
      labels:
        app: spring-helloworld
    spec:
      containers:
      - name: spring-helloworld
        image: cloud007/simple-spring:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 80

Here are some common problems faced during this process and the troubleshooting procedure.

  • Corrupt Jenkins exec file:
    Solved by doing an apt-purge and then apt-install Jenkins.
  • Using 32-bit VM:
    Kubectl is not supported on a 32-bit machine and hence make sure the system is 64-bit.
  • Installing azure cli manually makes it inaccessible for non-root users
    Manually installing azure cli placed it in the default directories, which were not accessible by non-root users and hence by Jenkins. So, it is recommended to install azure cli using apt.
  • Installing minikube using local cluster instead of AKS:
    VirtualBox does not support nested VT-x virtualization and hence cannot run minikube. It is recommended to enable Hyper-V and use it as the driver to run minikube.
  • Naming the stages in the Jenkinsfile:
    For some reason, Jenkins did not accept a stage named ‘Build Docker Image’ or other multi-word names. Use a single word like ‘Build’, ‘Load’, etc.
  • Jenkins stopped building the project when the system ran out of memory:
    Make sure the host has at least 20 GB free in the hard disk before starting the project.
  • Jenkins couldn’t execute docker commands:
    Try the command usermod -aG docker jenkins
  • Spring app not accessible from external IP:
    We created a new service of type LoadBalancer, assigned it to the pod, and the application was accessible from the new external IP (see the sketch after this list).
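For reference, a sketch of exposing the deployment created above through a LoadBalancer service (the service name and ports are illustrative):

kubectl expose deployment spring-helloworld \
  --type=LoadBalancer \
  --name=spring-helloworld-svc \
  --port=80 --target-port=80
# Watch until an external IP is assigned
kubectl get service spring-helloworld-svc --watch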

In an upcoming article we will show you how to deploy a pod containing three applications using a Jenkins CI/CD pipeline and update them selectively.

Welcome to Cloud View!

This week we present to you some fresh articles on the evolution of some of the most dominant technologies for the coming decade.

Predictions

Blockchain in 2020

2019 has been a year of great highs and some lows for the blockchain technology. The next year is going to see some significant growth and consolidation of large blockchain protocols and digital assets.

Cybersecurity in 2020

Everyone agrees that security is going to be ALL IMPORTANT in the coming year. Here is an article that positions 2020 as the year of the breach. We guarantee you will bump security to the top of your list after reading this.


Industry insights

New features in Azure Monitor Metrics Explorer

A few months ago, Microsoft’s Azure clients gave some feedback regarding the use of metrics in Azure Portal. Now the Microsoft team comes back with some new features which address the main concerns of the community.


The Update Framework (TUF)

The ninth to join the CNCF’s list of mature technologies – The Update Framework (TUF) is an open-source technology that secures software update systems.


From CloudIQ

Kubernetes Deployment Controller – An Inside Look

Kubernetes Deployment Controller helps monitor and manage the upgrade, downgrade, and scaling of services without any disruption or downtime. Here’s a detailed look at the inner workings of Kubernetes Deployment Controller.

Implementing Azure AD Pod Identity in AKS Cluster

Cloud-based identity and access management service becomes a necessity for connecting pods in AKS cluster to access other Azure cloud resources and services. Here is a detailed look at how Azure AD Pod Identity helps.

Containers and container orchestration have become the default choice for any DevOps team that wants to scale on demand, reduce costs, and deliver faster. And to get the best out of container technology, Kubernetes is the way to go. A recommended Kubernetes practice is to manage pods through a Deployment; this way, they can be monitored and restarted if a failure occurs.

A deployment is created by using a Kubernetes Deployment Controller object. The application (in a container) is deployed to Kubernetes by declaratively passing a desired state to the Kubernetes Deployment Controller. A K8s deployment controller object is utilized for monitoring, management of upgrade, downgrade, and scaling of services (e.g., pods) without any disruption or downtime. This is made possible because the deployment controller is the single source of truth for the sizes of new and old replica sets. It maintains multiple replica sets, and when you describe a desired state, the DC changes the actual state at the correct pace.

Here’s a detailed look at the inner workings of Kubernetes Deployment Controller

The K8s deployment controller is responsible for the following functions:

– Managing a set of pods in the form of Replica Sets & Hash-based labels
– Rolling out new versions of application through new Replica Sets
– Rolling back to old versions of application through old Replica Sets
– Pause & Resume Rollout/Rollback functions
– Scale-Up/Down functions

“The Kubernetes controller manager is a daemon that embeds the core control loops shipped with Kubernetes. In applications of robotics and automation, a control loop is a non-terminating loop that regulates the state of the system. In Kubernetes, a controller is a control loop that watches the shared state of the cluster through the apiserver and makes changes attempting to move the current state towards the desired state. Examples of controllers that ship with Kubernetes today are the replication controller, endpoints controller, namespace controller, and serviceaccounts controller.”

func NewControllerInitializers(loopMode ControllerLoopMode) map[string]InitFunc {
        controllers := map[string]InitFunc{}
        controllers["endpoint"] = startEndpointController
        controllers["endpointslice"] = startEndpointSliceController
        controllers["replicationcontroller"] = startReplicationController
        controllers["podgc"] = startPodGCController
        controllers["resourcequota"] = startResourceQuotaController
        controllers["namespace"] = startNamespaceController
        controllers["serviceaccount"] = startServiceAccountController
        controllers["garbagecollector"] = startGarbageCollectorController
        controllers["daemonset"] = startDaemonSetController
        controllers["job"] = startJobController
        controllers["deployment"] = startDeploymentController
        controllers["replicaset"] = startReplicaSetController
        controllers["horizontalpodautoscaling"] = startHPAController
        controllers["disruption"] = startDisruptionController
        controllers["statefulset"] = startStatefulSetController
        controllers["cronjob"] = startCronJobController
        controllers["csrsigning"] = startCSRSigningController
        controllers["csrapproving"] = startCSRApprovingController
        controllers["csrcleaner"] = startCSRCleanerController
        controllers["ttl"] = startTTLController
        controllers["bootstrapsigner"] = startBootstrapSignerController
        controllers["tokencleaner"] = startTokenCleanerController
        controllers["nodeipam"] = startNodeIpamController
        controllers["nodelifecycle"] = startNodeLifecycleController
        if loopMode == IncludeCloudLoops {
                controllers["service"] = startServiceController
                controllers["route"] = startRouteController
                controllers["cloud-node-lifecycle"] = startCloudNodeLifecycleController
                // TODO: volume controller into the IncludeCloudLoops only set.
        }
        controllers["persistentvolume-binder"] = startPersistentVolumeBinderController
        controllers["attachdetach"] = startAttachDetachController
        controllers["persistentvolume-expander"] = startVolumeExpandController
        controllers["clusterrole-aggregation"] = startClusterRoleAggregrationController
        controllers["pvc-protection"] = startPVCProtectionController
        controllers["pv-protection"] = startPVProtectionController
        controllers["ttl-after-finished"] = startTTLAfterFinishedController
        controllers["root-ca-cert-publisher"] = startRootCACertPublisher

        return controllers
}

Let’s look at the inner workings of the “Deployment” controller. It watches for the following object updates.

func startDeploymentController(ctx ControllerContext) (http.Handler, bool, error) {
        if !ctx.AvailableResources[schema.GroupVersionResource{Group: "apps", Version: "v1", Resource: "deployments"}] {
                return nil, false, nil
        }
        dc, err := deployment.NewDeploymentController(
                ctx.InformerFactory.Apps().V1().Deployments(),
                ctx.InformerFactory.Apps().V1().ReplicaSets(),
                ctx.InformerFactory.Core().V1().Pods(),
                ctx.ClientBuilder.ClientOrDie("deployment-controller"),
        )
        if err != nil {
                return nil, true, fmt.Errorf("error creating Deployment controller: %v", err)
        }
        go dc.Run(int(ctx.ComponentConfig.DeploymentController.ConcurrentDeploymentSyncs), ctx.Stop)
        return nil, true, nil
}

The “Deployment Controller” initializes the following Event handlers.

dInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
                AddFunc:    dc.addDeployment,
                UpdateFunc: dc.updateDeployment,
                // This will enter the sync loop and no-op, because the deployment has been deleted from the store.
                DeleteFunc: dc.deleteDeployment,
        })
        rsInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
                AddFunc:    dc.addReplicaSet,
                UpdateFunc: dc.updateReplicaSet,
                DeleteFunc: dc.deleteReplicaSet,
        })
        podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
                DeleteFunc: dc.deletePod,
        })

Since Kubernetes controllers work asynchronously, incoming events are placed on work queues and processed by workers.

func (dc *DeploymentController) addDeployment(obj interface{}) {
        d := obj.(*apps.Deployment)
        klog.V(4).Infof("Adding deployment %s", d.Name)
        dc.enqueueDeployment(d)
}

The items from the queue are handled by the “syncDeployment” handler. Some of the operations performed by the handler are shown below.

// List ReplicaSets owned by this Deployment, while reconciling ControllerRef
        // through adoption/orphaning.
        rsList, err := dc.getReplicaSetsForDeployment(d)
	
	// List all Pods owned by this Deployment, grouped by their ReplicaSet.
        // Current uses of the podMap are:
        //
        // * check if a Pod is labeled correctly with the pod-template-hash label.
        // * check that no old Pods are running in the middle of Recreate Deployments.
        podMap, err := dc.getPodMapForDeployment(d, rsList)

	// Update deployment conditions with an Unknown condition when pausing/resuming
        // a deployment. In this way, we can be sure that we won't timeout when a user
        // resumes a Deployment with a set progressDeadlineSeconds.
        if err = dc.checkPausedConditions(d); err != nil {
                return err
        }

	// rollback is not re-entrant in case the underlying replica sets are updated with a new
        // revision so we should ensure that we won't proceed to update replica sets until we
        // make sure that the deployment has cleaned up its rollback spec in subsequent enqueues.
        if getRollbackTo(d) != nil {
                return dc.rollback(d, rsList)
        }

        scalingEvent, err := dc.isScalingEvent(d, rsList)
        if err != nil {
                return err
        }
        if scalingEvent {
                return dc.sync(d, rsList)
        }

        switch d.Spec.Strategy.Type {
        case apps.RecreateDeploymentStrategyType:
                return dc.rolloutRecreate(d, rsList, podMap)
        case apps.RollingUpdateDeploymentStrategyType:
                return dc.rolloutRolling(d, rsList)
        }

Sync is responsible for reconciling deployments on scaling events or when they are paused.

func (dc *DeploymentController) sync(d *apps.Deployment, rsList []*apps.ReplicaSet) error {
        newRS, oldRSs, err := dc.getAllReplicaSetsAndSyncRevision(d, rsList, false)
        if err != nil {
                return err
        }
        if err := dc.scale(d, newRS, oldRSs); err != nil {
                // If we get an error while trying to scale, the deployment will be requeued
                // so we can abort this resync
                return err
        }

        // Clean up the deployment when it's paused and no rollback is in flight.
        if d.Spec.Paused && getRollbackTo(d) == nil {
                if err := dc.cleanupDeployment(oldRSs, d); err != nil {
                        return err
                }
        }

        allRSs := append(oldRSs, newRS)
        return dc.syncDeploymentStatus(allRSs, newRS, d)
}


// scale scales proportionally in order to mitigate risk. Otherwise, scaling up can increase the size
// of the new replica set and scaling down can decrease the sizes of the old ones, both of which would
// have the effect of hastening the rollout progress, which could produce a higher proportion of unavailable
// replicas in the event of a problem with the rolled-out template. Should run only on scaling events or
// when a deployment is paused and not during the normal rollout process.

func (dc *DeploymentController) scale(deployment *apps.Deployment, newRS *apps.ReplicaSet, oldRSs []*apps.ReplicaSet) error {

 	// If there is only one active replica set then we should scale that up to the full count of the
        // deployment. If there is no active replica set, then we should scale up the newest replica set.
        if activeOrLatest := deploymentutil.FindActiveOrLatest(newRS, oldRSs); activeOrLatest != nil {


	// If the new replica set is saturated, old replica sets should be fully scaled down.
        // This case handles replica set adoption during a saturated new replica set.
        if deploymentutil.IsSaturated(deployment, newRS) {

 // There are old replica sets with pods, and the new replica set is not saturated. 
        // We need to proportionally scale all replica sets (new and old) in case of a
        // rolling deployment.
        if deploymentutil.IsRollingUpdate(deployment) {

		// Number of additional replicas that can be either added or removed from the total
                // replicas count. These replicas should be distributed proportionally to the active
                // replica sets.
                deploymentReplicasToAdd := allowedSize - allRSsReplicas

                // The additional replicas should be distributed proportionally amongst the active
                // replica sets from the larger to the smaller in size replica set. Scaling direction
                // drives what happens in case we are trying to scale replica sets of the same size.
                // In such a case when scaling up, we should scale up newer replica sets first, and
                // when scaling down, we should scale down older replica sets first.

We hope this article helped you understand the inner workings of Kubernetes deployment controller. If you would like to learn more about Kubernetes and get certified, join our 2-day Kubernetes workshop.

Welcome to Cloud View!

For the last couple of weeks we have been curating predictions for the coming year (and decade) from well-regarded sources. Now it’s time to drill down deeper into specific areas and find out what experts in the field see in store for the future.

Predictions

Cybersecurity: Mitigating cyber-attacks and risks

Forbes has put out a really exhaustive list of predictions (141 to be exact!) in the cyber security realm. These are all from key players and professionals in the digital arena – CIOs, CEOs, CFOs and security heads from across the digital spectrum weigh in with what they think is crucial to mitigate cyber attacks and risks in the coming few years.

Future of DevOps

DevOps is all about bringing the power of collaboration to executing business ideas; turning organizational visions into applications that drive growth and profits. So what does the future hold for the DevOps community?


Industry Speak

AT&T integrating 5G with Microsoft cloud

5G has been in the news for all the wrong reasons, but finally we see some interesting news emerging from the industry. A strategic partnership between Microsoft and AT&T announces that AT&T’s 5G core will run on Azure!


Kubernetes for exponential growth

Containers and container orchestration with Kubernetes are vital for any tech-based business looking to deliver more features – faster and more affordably. Here is a look at how AlphaSense, one of the top AI start-ups, leveraged Kubernetes to accelerate growth.


From CloudIQ

Optimizing Azure Cosmos DB Performance

Azure Cosmos DB allows Azure platform users to elastically and independently scale throughput and storage across any number of Azure regions worldwide. Here is an article on how to optimize Cosmos DB performance.

How to Debug and Troubleshoot Common Problems in Kubernetes Deployments

Kubernetes deployment issues are not always easy to troubleshoot. In some cases, the errors can be resolved easily, and, in some cases, detecting errors requires us to dig deeper and run various commands to identify and resolve the issues. Here is a guided tutorial to debug applications that are deployed into Kubernetes.

Welcome to Cloud View!

With the new year 2020 coming closer, the industry is firmly looking towards the future. Cloud news is full of predictions for the year ahead and here’s a quick selection we picked for this week’s reading.

Predictions

Forrester’s cloud computing predictions for 2020

Forrester has an excellent track record of predicting the right cloud trends. And that makes their 2020 cloud computing predictions a MUST-READ. Here is a breakdown from TechRepublic.

Gartner’s top strategic predictions for 2020 and beyond

Last week we put the spotlight on Gartner’s strategic trends; this week we delve further into these to understand how they would affect people, their lives, and their work. Unsurprisingly, AI takes center stage again.


Industry Insights

Telcos embrace containers

Gartner predicts that over 75% of global companies will run containerized applications by 2022. Kubernetes is the leading container orchestration platform for managing these containers. Here is a look at how Telcos are planning to use it to deploy cloud-native 5G networks.


AWS IoT Day – Eight Powerful New Features

AWS regularly puts out bundled themed announcements, which make it easy for us to find relevant information in one place. Here is the one related to AWS IoT Day. Check out 8 powerful AWS features, from secure tunneling to Alexa Voice Service integration and more.


Interesting announcements from KubeCon

Over 100 announcements were made at KubeCon, here’s a quick read of the 10 most important ones.


This week at CloudIQ

Kubernetes on Azure: A 2-day workshop for AKS developers

Container technology has revolutionized the DevOps landscape and offers organizations the chance to develop and test applications faster and more cost-effectively. CloudIQ’s 2-day hands-on workshop is designed to give DevOps team members the opportunity to skill-up and learn Kubernetes design, deployment, and management.

Configuring Palo Alto Networks Next-Generation Firewall (NGFW) – A Detailed Guide

Today organizations require an enterprise cyber-security platform, which provides network security, cloud security, endpoint protection, & various related cloud-delivered security services. Palo Alto Networks Next-Generation Firewall (NGFW) fits the bill and here is a detailed guide on configuring it.

End-to-end front-end testing has always been a bit of a pain for developers. Testing is one of the critical final steps of any development project, however web testing has tested the patience of all developers at some time or another. The modern web testing ecosystem comes with its own set of challenges – from data security to additional time and expense to managing the dynamic behavior of the contemporary development frameworks. Hence, the need to bring automation to the testing process!

Benefits of Automation Testing
  • Automation increases the speed of test execution
  • Automation helps increase test coverage
  • Automation is well suited to regression testing
  • Automation testing works well when the GUI stays the same but there are frequent functional changes

When to use Automation Testing?

Here are some scenarios where Automation testing is highly recommended

  • Requirements do not change frequently
  • Load and performance testing of the application with many virtual users
  • Software that is stable with respect to manual testing
  • Availability of time
  • Huge and business-critical projects
  • Projects that need to test the same areas often

Automation testing step by step

There are lots of helpful tools to write automation scripts, however, before using those tools it’s important to identify the process for test automation.

  • Identify areas within the software to automate
  • Choose the appropriate tool for test automation
  • Write test scripts
  • Develop test suites
  • Execute test scripts
  • Build result reports
  • Find possible bugs or performance issues

List of automation tools:

Test automation frameworks help us improve the quality, speed, and accuracy of the testing process. Here is a list of automation tools:

  • Cypress
  • Selenium
  • Protractor
  • Appium (mobile)

Why choose Cypress:

Cypress solves many of the main testing bottlenecks developers face regularly. It is a JavaScript-based end-to-end testing framework that doesn’t use Selenium (the most widely used testing tool) at all. It is built on top of Mocha, a JavaScript test framework that runs in the browser and makes asynchronous testing simple. Cypress automatically waits for DOM elements to load and become visible, for AJAX calls to finish, and so on, so we don’t need to use implicit or explicit waits.

Another advantage Cypress offers to developers is that it runs directly in the browser with no network communication. The architecture makes testing and development happen simultaneously. It allows developers access to tools, and they can make changes and see them reflected in real-time. Naturally, this lends more precision and speed to the whole process.

Features of Cypress:
  • Time travel: Cypress takes snapshots as your test runs.
  • Debuggability: Cypress can guess why a test case has failed.
  • Automatic waiting: There is no need to use wait or sleep because it automatically waits for your commands.
  • Spies, stubs, and clocks: verify and control the behavior of functions, server response, and timer.
  • Screenshot and video: Cypress testing automatically takes screenshots when your test case fails and makes a video of the complete result when it is run from the CLI.
Features of Mocha:

Mocha provides the below benefits,

  • Browser support
  • Async & promises support
  • Test coverage reporting

Advantages of Cypress:
  • Open-source
  • Promise support
  • JavaScript testing framework
  • Easy and reliable testing
  • Fast, free, and open-source
  • Easy control over responses, headers, and status codes
  • Helps you find locators

Installing Cypress:

Installing Cypress is an easy task compared to a Selenium installation. Two commands are used to install Cypress on a machine:

  1. npm init
  2. npm install cypress

The first command creates a “package.json” file, and the second installs Cypress and all its dependencies.

Project Folder Structure Details:

The project folder structure is as follows:

*node_modules folder – the directory where npm installs the project’s dependencies (including Cypress itself).

*package.json file – the file in the app root that lists the libraries installed into node_modules when you run “npm install”.

*cypress folder – contains folders such as fixtures, integration, plugins, screenshots, support, and videos. Their purposes are described below.

a. Fixtures – holds external static data that your tests can use.

b. Integration – where you write the test cases (specs) for your app.

c. Screenshots – stores screenshots taken during your tests.

d. Videos – stores videos recorded during test runs.

e. Support – holds the common/custom commands file shared by your tests.

Write your sample program in Cypress:

Step 1: Open your visual studio code in your machine

Step 2: Create a new Cypress project folder and name it as “cypresse2e”

Step 3: Open the command line and go to the above-created project path.

Step 4: Type the first command listed under the “Installing Cypress” heading, then wait for it to create the package.json file.

Step 5: After that, type the second command.

Step 6: The above task finishes within 2–3 minutes, after creating the cypress and node_modules folders inside the “cypresse2e” folder. This folder will also contain a “json” file.

Step 7: Click the cypress folder under “cypresse2e” in VS Code.

Step 8: The details of the page to automate are given below.

We will use the CloudIQ home page link for this automation.

Step 9: Create the “cypressAutomation.spec.ts” file under the integration folder and write the test program, as shown in the screenshot below.
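Since the screenshot itself is not reproduced here, the following is a minimal sketch of what such a spec could look like, based on the steps described under “Program Explanation” below; the exact URL and link selector used in the original are assumptions.

// cypressAutomation.spec.ts – a minimal sketch; the URL and selector are assumptions
describe('CloudIQ home page automation', () => {
  it('navigates from the home page to the AWS page', () => {
    // Navigate to the cloudiqtech site
    cy.visit('https://www.cloudiqtech.com');

    // Wait 10 seconds for the page to load
    cy.wait(10000);

    // Click the "AWS" link, which navigates to the AWS page
    cy.contains('AWS').click();

    // Validate that the current page is the AWS page
    cy.url().should('include', 'aws');
  });
});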

Program Explanation:

Here is what the test script given above does.

  • Navigate to the “cloudiqtech” site.
  • Wait for 10 seconds for the page to load.
  • Next, click the “AWS” link.
  • This navigates to the “AWS” page.
  • Finally, validate that the current page is the AWS page.

Step 10: Open the command terminal, go to your Cypress project path, and run the command shown below.
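The command in the screenshot is not reproduced here; with a standard Cypress installation it is typically one of the following (an assumption, not a transcription of the original screenshot):

npx cypress open   – opens the interactive Cypress Test Runner
npx cypress run    – runs all specs headlessly from the CLI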

Step 11: After waiting for 1–2 minutes, it will open the Cypress Test Runner, as shown below in the screenshot. It lists all the tests – the ones you wrote in your automation spec as well as the default example tests.

Step 12: Click your “cypressAutomation.spec.ts” file; this automatically opens the default Chrome browser to run your test and shows a report of the test run in the browser, like below.

Test Result:

*Three tests passed successfully.

*No tests failed.

*In total, the three tests ran in 30.44 seconds.

*Screenshots were automatically taken during the tests. If you hover over a test case in the runner, it displays the screenshot image for that test case.

Welcome to Cloud View!

The cloud computing landscape is evolving, innovating, and expanding almost at the speed of thought! Every week there are hundreds of announcements, insights, opinions, and studies discussing new trends, ideas, and technological breakthroughs. To help you get the most relevant information, we have put together a weekly curated list of must-read articles under Cloud View.

Predictions

Cloud computing in 2020

As 2019 winds down, our eyes turn to the next year. Let’s find out what the pundits are predicting for cloud computing in 2020.
Hybrid cloud is passé as omni-cloud becomes the preferred enterprise approach; Kubernetes is all set to become the dominant talking point in tech conversations in 2020, and AI will become omnipresent.

How will upcoming technology affect humans – at work or at home?

How can IT leaders invest in today’s technology for a future payoff? Here are the top 10 strategic tech trends by Gartner – a look into the future. A must-read before planning for the coming year.

AI and Robotics

No talk of the future is complete without AI and robotics! Keith Shaw, editor-in-chief of Robotics Business Review, shares some insights on where robotics is heading.

Industry Insights

Microsoft and Google Cloud’s battle for the enterprise

Google Cloud is aggressively trying to elbow into the cloud market, leading to a battle royale. All the cloud giants have deep pockets and are not afraid of investing BIG, which is great for innovation and enterprise clients. Check out the whole article for a more in-depth look at how the race for cloud dominance is playing out.  

Confidential Computing

Microsoft brings confidential computing capability to Kubernetes workloads – an additional layer of security to keep business data safe.

This week at CloudIQ

 

CloudIQ Technologies is now a Kubernetes Certified Service Provider (KCSP)

Cloud Native Computing Foundation (CNCF) recognizes CloudIQ as one of the few (122 as of Nov 19, 2019) service providers worldwide to receive KCSP certification. As a KCSP, CloudIQ will be able to access collaborative group support to help its clients develop and deploy cloud-native applications quickly and efficiently. The CNCF partnership program also puts CloudIQ in touch with organizations looking to design, adopt, and implement cloud-native solutions.

 

Understanding Kubernetes Concepts – A QuickStart Guide 

Container technology made software development more agile, however, containers need to be tracked, monitored, and managed, which is where container orchestration and Kubernetes come in. Here is a quick start guide to understanding Kubernetes concepts.

At a Glance

Seattle, WA, November 21, 2019 – CloudIQ Technologies, a fast-growing premier cloud consulting and solutions provider, announced today its status as a Kubernetes Certified Service Provider (KCSP).

Cloud Native Computing Foundation (CNCF) is a non-profit organization, part of the Linux Foundation, that promotes cloud-native computing and serves as the certification body for Kubernetes. CNCF recognizes CloudIQ as one of the few (121 as of this date) service providers worldwide to receive KCSP certification.

“We are so thrilled to welcome CloudIQ Technologies into the CNCF family,” said Dan Kohn, Executive Director of the Cloud Native Computing Foundation. “As part of our select group of Kubernetes Certified Services Providers (KCSPs), CloudIQ will be an integral part of our Kubernetes platform outreach and will be available to help organizations successfully adopt Kubernetes.”

As a KCSP, CloudIQ will be able to access collaborative group support to help its clients develop and deploy cloud native applications quickly and efficiently. The CNCF partnership program also puts CloudIQ in touch with organizations looking to design, adopt and implement cloud native solutions.

CloudIQ’s technology experts, who have more than a decade of varied technical and business experience, deliver the full range of cloud-native / open source solutions to its clients through its team of Azure & AWS certified architects, Certified Kubernetes Administrators (CKA), Certified Kubernetes Developers, and DevSecOps professionals.

“We have been extremely happy to work closely with the CNCF community to push the boundaries of the Kubernetes platform further and pass on the benefits of cloud-native technologies to our clients,” said Mr. Prem Kandalu, CEO. “As a part of the Cloud Native Computing Foundation, we will continue to develop new capabilities and strengthen our cloud strategy and business. We hope our contribution to the pooled expertise at CNCF delivers greater speed, scale, economic advantages to developers across the board and leads to the growth of the entire K8s community.”

About CloudIQ Tech:

CloudIQ is a leading cloud consulting and solutions firm with deep industry expertise in building cloud-native solutions that help customers realize the cost, scale, and security benefits of the cloud. As strategic advisors to several Fortune 500 companies, CloudIQ provides operational strategies, monitoring & assistance with command centers, and application transformation guidance for comprehensive cloud migrations.

For more information visit our site https://www.cloudiqtech.com
Or you are welcome to contact us by calling us at +1 (206) 203-4151
Or mailing at [email protected]

Cosmos DB is Microsoft Azure’s hugely successful tool to help their clients manage data on a global scale. This multi-model database service allows Azure platform users to elastically and independently scale throughput and storage across any number of Azure regions worldwide.

As Cosmos DB supports multiple data models, you can take advantage of fast, single-digit-millisecond data access using any of your favorite APIs, including SQL, MongoDB, Cassandra, Tables, or Gremlin. Since it is a NoSQL database, anyone with MongoDB experience can easily work with Cosmos DB, while support for the SQL API makes it just as approachable for anyone with SQL knowledge.

Why Cosmos DB?

For organizations looking to build a flexible and scalable database that is globally distributed, Cosmos DB is especially useful as it

  • provides a ready-to-use, extremely dynamic database service
  • guarantees low latency of less than 10 milliseconds when reading data and less than 15 milliseconds when writing data.
  • offers customers a faster, completely seamless experience.
  • offers 99.99% availability

Here are some tried and tested tips from our senior Azure expert on how to get the most out of Cosmos DB.

Data Modeling

Cosmos DB is great because it lets you model semi-structured/unstructured aggregates and dynamic entities. This means it’s very easy to model ever-changing entities, entities that don’t all share the same attributes, and hierarchical aggregates. To model for Cosmos, you need to think in terms of hierarchy and aggregates instead of entities and relations. NoSQL lets you, say, store a thing that has other things, which have things of their own, and then get the whole hierarchy of things back in one read. So, you don’t have a person, rental addresses, and a relation between them; instead, you have rental records, which aggregate for each person the rental addresses they’ve had.
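As an illustration of this aggregate-oriented style, here is a hypothetical shape for such a rental record, sketched as a TypeScript type with a sample document; the property names are assumptions for illustration, not a prescribed schema.

// A hypothetical hierarchical aggregate: one document per person,
// with that person's rental addresses nested inside it.
interface RentalRecord {
  id: string;            // document id
  personName: string;
  rentalAddresses: {
    street: string;
    city: string;
    from: string;        // ISO dates kept as strings in JSON
    to?: string;
  }[];
}

const example: RentalRecord = {
  id: "person-001",
  personName: "Jane Doe",
  rentalAddresses: [
    { street: "12 Lake View Rd", city: "Seattle", from: "2017-01-01", to: "2018-06-30" },
    { street: "48 Pine St", city: "Bellevue", from: "2018-07-01" }
  ]
};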

The following NoSQL rules apply perfectly to Cosmos DB too.

  • It should be used as a complement to an existing or additional database.
  • Plan around the PACELC theorem, an extension of the CAP theorem.
  • A data modeler should think in terms of queries instead of in terms of storage.

Connection Types

Cosmos DB can be connected to an application in two modes:

  • Gateway mode
  • Direct mode

Gateway mode is the default in the Microsoft.Azure.DocumentDB SDK; it uses HTTPS with a single endpoint. Direct mode, on the other hand, is the default for the .NET V3 SDK and uses TCP and HTTPS for connectivity.

Gateway mode is better when your application runs within a corporate network with strict network rules, because it has a single endpoint that can be configured in the firewall for security. However, gateway mode performance is lower compared to direct mode.

There is also an option to connect through the RESTful programming model. All CRUD operations can be done through REST calls. This method is recommended if you need a client app to access the database directly instead of going through an API. The overhead of building an API wrapper around Cosmos DB is thus removed, and the associated performance penalty is avoided.

In most scenarios, direct mode is the recommended option, as it provides better performance.

I am using the popular volcano dataset to compare the response times of the SDK and the RESTful model.

Query executed in both versions:

SELECT * FROM c where c.Status="Holocene"

Response details of the SDK

The query returned the data in 3,710 ms.

Response details of the RESTful model

The query returned the data in 5,810 ms.

If we develop an API using this model, then the response time added by our own API also needs to be considered, so using the RESTful model inside an API trades away performance. Use this model when querying from the client directly.

Partitioning the DB

The logical partition key is central to getting good performance from Cosmos DB operations. For example, suppose you have a database with around 1,500 student records from a school. A simple search for a student named “Peter” scans all 1,500 entries, which consumes a lot of throughput to get the result. Now split the data logically by the “Grade” the students belong to. Querying for a student named “Peter” in “Grade 5” makes the system search only the 30 or 40 students in that partition instead of all 1,500, saving throughput and improving performance compared to the earlier approach.

Common guidelines for choosing a partition key are listed below.

a. Any property such as city, state, or country can be used as the partition key.
 b. No partition key is required for containers up to 10 GB.
 c. The query should be provided with the partition key to be searched (see the query sketch after this list).
 d. The property selected as the partition key must be present in all the documents in the container.
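To illustrate point (c), here is a rough sketch of a partition-scoped query using the JavaScript/TypeScript SDK (@azure/cosmos); the endpoint, database, container, and property names are hypothetical.

import { CosmosClient } from "@azure/cosmos";

// Hypothetical endpoint/key/database/container names for illustration only.
const client = new CosmosClient({ endpoint: process.env.COSMOS_ENDPOINT!, key: process.env.COSMOS_KEY! });
const container = client.database("school").container("students");

async function findPeterInGrade5() {
  // Supplying the partition key ("Grade 5") restricts the query to a single
  // logical partition instead of scanning all 1,500 students.
  const { resources } = await container.items
    .query(
      { query: "SELECT * FROM c WHERE c.name = @name", parameters: [{ name: "@name", value: "Peter" }] },
      { partitionKey: "Grade 5" }
    )
    .fetchAll();
  return resources;
}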

I am using the popular volcano dataset to test the performance with and without a partition key.

1. I initially created a collection without a partition key. The performance for the given query is

SELECT * FROM c where c.Status="Holocene"

Resultset

Metric                                        Value
Partition key range id                        0
Retrieved document count                      200
Retrieved document size (in bytes)            100769
Output document count                         200
Output document size (in bytes)               101069
Index hit document count                      200
Index lookup time (ms)                        0.21
Document load time (ms)                       1.29
Query engine execution time (ms)              0.33
System function execution time (ms)           0
User defined function execution time (ms)     0
Document write time (ms)                      0.52

2. Then I recreated the same collection with “/country” as the partition key. Now the same query, with “Japan” as the partition key value, returns the following results.

Resultset

Metric                                        Value
Partition key range id                        0
Retrieved document count                      16
Retrieved document size (in bytes)            7887
Output document count                         16
Output document size (in bytes)               7952
Index hit document count                      16
Index lookup time (ms)                        0.23
Document load time (ms)                       0.17
Query engine execution time (ms)              0.06
System function execution time (ms)           0
User defined function execution time (ms)     0
Document write time (ms)                      0.01

Tune the Index

Indexing is always a top priority item in the checklist to tune performance. Indexing is an internal job that keeps track of metadata about the data, which helps in finding the result set for a query. By default, all the properties of a Cosmos container are indexed. But indexing every property is unnecessary overhead for the DB, and maintaining that much index data consumes extra RUs, which is not cost-effective. The better approach is to exclude all paths from indexing and then include only the paths that are actually used for querying in the application (a sketch of such a policy follows).
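A rough sketch of such an opt-in policy with the JavaScript/TypeScript SDK is shown below; the database, container, and partition key names are assumptions carried over from the earlier example.

import { CosmosClient } from "@azure/cosmos";

const client = new CosmosClient({ endpoint: process.env.COSMOS_ENDPOINT!, key: process.env.COSMOS_KEY! });

async function createVolcanoContainer() {
  // Opt-in indexing: exclude every path, then index only the property that
  // the application actually queries on ("Status" in the example query above).
  await client.database("volcanodb").containers.createIfNotExists({
    id: "volcano",
    partitionKey: { paths: ["/country"] },
    indexingPolicy: {
      indexingMode: "consistent",
      excludedPaths: [{ path: "/*" }],
      includedPaths: [{ path: "/Status/?" }]
    }
  });
}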

Indexing Mode                            With Default Indexing    With Custom Indexing
RU’s consumed                            3146.15                  19.83
Output document count                    100                      100
Document load time (in ms)               646.51                   2.32
Query engine execution time (in ms)      434.26                   4.96
System function execution time (in ms)   57.03                    2.41

Paging

The execution of a query, by default, returns 100 documents per page. We can increase this by setting the “maxItemCount” value, up to a maximum of 1000 documents. But fetching 1000 documents at a time is rarely meaningful except in a few scenarios. To improve performance and show a crisp result set to the user, keep “maxItemCount” small. Unlike SQL databases, pagination is the default behavior in Cosmos DB. So even if you set the maximum count to 1000 and the query matches more documents than that, Cosmos returns a “continuation token”. This token uniquely identifies the query and the position of the next page, so subsequent requests can pick up where the previous page left off, and the pagination logic can be handled at the front end. By reducing the number of documents per response, we save throughput consumption and network traffic, and increase performance.
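A minimal sketch of page-by-page fetching with the JavaScript/TypeScript SDK, reusing the hypothetical volcano container from above, might look like this:

import { CosmosClient } from "@azure/cosmos";

const client = new CosmosClient({ endpoint: process.env.COSMOS_ENDPOINT!, key: process.env.COSMOS_KEY! });
const container = client.database("volcanodb").container("volcano");

async function fetchFirstTwoPages() {
  const query = "SELECT * FROM c WHERE c.Status = 'Holocene'";

  // Ask for small pages; Cosmos returns a continuation token while more results remain.
  const firstPage = await container.items.query(query, { maxItemCount: 100 }).fetchNext();
  console.log(`Page 1: ${firstPage.resources.length} documents`);

  // Hand the token back (for example, from the front end) to resume where we left off.
  if (firstPage.continuationToken) {
    const secondPage = await container.items
      .query(query, { maxItemCount: 100, continuationToken: firstPage.continuationToken })
      .fetchNext();
    console.log(`Page 2: ${secondPage.resources.length} documents`);
  }
}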

Throughput Management

RUs, or Request Units, are the common currency we deal with when using Cosmos DB. When you read a document from a container or write one to it, you trade RUs with Cosmos for that operation. It is like money in the everyday world: without money you can’t buy anything, and without RUs you can’t query anything. You can buy only items that cost as much as or less than the money you have in hand; similarly, you can run only the operations that your available RUs can pay for.

If you have a lot of data and a query needs to traverse deep into the collection, you need enough RUs. Every property added to the index consumes some RUs, so if all properties are indexed, your RU spend on indexing is high and you are left with fewer RUs for querying. Always add to the index only the properties that are needed while querying.

Index properly, save RUs, and spend them on querying. For example, suppose you have 100K documents in a container and, with default indexing, a query exhausts its 1,000 provisioned RUs after covering only 50K documents; the query never reaches the remaining 50K documents, and they never appear in the result set. If appropriately indexed, the same query may consume only 400 RUs to cover all 100K documents.
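Since every operation is billed in RUs, it is worth inspecting the request charge while tuning indexes and queries; here is a small sketch with the JavaScript/TypeScript SDK (again using the hypothetical volcano container):

import { CosmosClient } from "@azure/cosmos";

const client = new CosmosClient({ endpoint: process.env.COSMOS_ENDPOINT!, key: process.env.COSMOS_KEY! });
const container = client.database("volcanodb").container("volcano");

async function measureQueryCharge() {
  // The response reports how many RUs the query consumed, which makes the
  // effect of indexing and partitioning changes directly measurable.
  const response = await container.items
    .query("SELECT * FROM c WHERE c.Status = 'Holocene'")
    .fetchAll();
  console.log(`${response.resources.length} documents for ${response.requestCharge} RUs`);
}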

Startup latency

The very first query is always a bit slow because of the time it takes to establish the connection. To avoid this latency, it is best practice in SDK 2 to call “OpenAsync()” up front when creating the client.

await client.OpenAsync();
Singleton Connection

The best approach is to create the DB client once and keep that connection alive across all instances of the application. Polling the DB periodically also keeps the connection warm. This reduces DB connectivity latency.
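In a Node.js service, this usually means creating the client once at module scope and reusing it everywhere, rather than constructing a new client per request; a minimal sketch:

import { CosmosClient } from "@azure/cosmos";

// Created once when the module is first loaded and then reused by every request,
// so the underlying connections stay warm for the lifetime of the process.
export const cosmosClient = new CosmosClient({
  endpoint: process.env.COSMOS_ENDPOINT!,
  key: process.env.COSMOS_KEY!
});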

Regions

Make sure Cosmos DB and the applications that use it are grouped within the same Azure region; this reduces latency significantly. The lowest possible latency is achieved when the calling application runs in the same Azure region as the provisioned Azure Cosmos DB endpoint.

Programming Best Practices
  • Always use the latest SDK version.
  • Use the streaming API (in SDK 3), which can receive and return data without serializing it – helpful when your API is just a relay and performs no logic on the data.
  • Tune the queries.
  • Implement retry logic with a reasonable wait time to avoid throttling during busy periods (a minimal sketch follows this list).
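The sketch below is one simple way to do this; it assumes the SDK surfaces throttled requests as errors carrying a 429 status code and just backs off exponentially (production code would also honor the retry-after hint returned by the service, and the SDK already performs some retries of its own).

// Retry an operation on throttling (HTTP 429) with exponential backoff.
async function withRetry<T>(operation: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let delayMs = 200;
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err: any) {
      const throttled = err?.code === 429 || err?.statusCode === 429;
      if (!throttled || attempt === maxAttempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      delayMs *= 2; // back off a little longer each time
    }
  }
}

// Usage (hypothetical): const items = await withRetry(() => container.items.query(query).fetchAll());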

By carefully analyzing all the above factors, we can improve the Cosmos DB query performance substantially.

Amazon Web Services (AWS) is a secure cloud services platform, offering compute power, database storage, content delivery, and other functionalities to help businesses scale and grow.

It gives organizations a secure and robust platform to develop their custom cloud-based solutions and has several unique features that make it one of the most reliable and flexible cloud platforms, such as

  • Mobile-friendly access through AWS Mobile Hub and AWS Mobile SDK
  • Fully managed purpose-built Databases
  • Serverless cloud functions
  • Range of storage options that are affordable and scalable.
  • Unbeatable security and compliance

Following are some core services offered by AWS:

AWS Core services
  1. An EC2 instance is a virtual server in Amazon’s Elastic Compute Cloud (EC2) for running applications on the AWS infrastructure.
  2. Amazon Elastic Block Store (EBS) is a cloud-based block storage system provided by AWS that is best used for storing persistent data.
  3. Amazon Virtual Private Cloud (Amazon VPC) enables us to launch AWS resources into a virtual network that we have defined. This virtual network closely resembles a traditional network that we would operate in our own data center, with the benefits of using the scalable infrastructure of AWS.
  4. Amazon S3 or Amazon Simple Storage Service is a service offered by Amazon Web Services that provides object storage through a web service interface. Amazon S3 uses the same scalable storage infrastructure that Amazon.com uses to run its global e-commerce network.
  5. AWS security groups (SGs) are associated with EC2 instances and provide security at the protocol and port access level. Each security group — working much the same way as a firewall — contains a set of rules that filter traffic coming into and out of an EC2 instance.

Let us look more deeply at one of AWS’s core services – AWS CloudFormation – that is key for managing workloads on AWS.

1.   CloudFormation

AWS CloudFormation is a service that helps us model and set up our Amazon Web Services resources so that we can spend less time managing those resources and more time focusing on our applications that run in AWS.  We create a template that describes all the AWS resources that we want (like Amazon EC2 instances or S3 buckets), and AWS CloudFormation takes care of provisioning and configuring those resources for us. We don’t need to individually create and configure AWS resources and figure out what’s dependent on what; AWS CloudFormation handles all of that.

A stack is a collection of AWS resources that you can manage as a single unit. In other words, we can create, update, or delete a collection of resources by creating, updating, or deleting stacks. All the resources in a stack are defined by the stack’s AWS CloudFormation template.

2.   CloudFormation template

CloudFormation templates can be written in either JSON or YAML.  The structure of the template in YAML is given below:

---
AWSTemplateFormatVersion: "version date"

Description:
  String
Metadata:
  template metadata
Parameters:
  set of parameters
Mappings:
  set of mappings
Conditions:
  set of conditions
Resources:
  set of resources
Outputs:
  set of outputs

In the above yaml file,

  1. AWSTemplateFormatVersion – The AWS CloudFormation template version that the template conforms to.
  2. Description – A text string that describes the template.
  3. Metadata – Objects that provide additional information about the template.
  4. Parameters – Values to pass to our template at runtime (when we create or update a stack). We can refer to parameters from the Resources and Outputs sections of the template.
  5. Mappings – A mapping of keys and associated values that we can use to specify conditional parameter values, like a lookup table. We can match a key to a corresponding value by using the Fn::FindInMap intrinsic function in the Resources and Outputs sections.
  6. Conditions – Conditions that control whether certain resources are created or whether certain resource properties are assigned a value during stack creation or update. For example, we can conditionally create a resource that depends on whether the stack is for a production or test environment.
  7. Resources – Specifies the stack resources and their properties, such as an Amazon Elastic Compute Cloud instance or an Amazon Simple Storage Service bucket.  We can refer to resources in the Resources and Outputs sections of the template.
  8. Outputs – Describes the values that are returned whenever we view our stack’s properties. For example, we can declare an output for an S3 bucket name and then call the AWS cloudformation describe-stacks AWS CLI command to view the name.

Resources is the only required section in the CloudFormation template.  All other sections are optional.

3.   CloudFormation template to create S3 bucket

S3template.yml

Resources:
  HelloBucket:
    Type: AWS::S3::Bucket

In AWS Console, go to CloudFormation and click on Create Stack

Upload the template file which we created.  This will get stored in an S3 location, as shown below.

Click next and give a stack name

Click Next and then “Create stack”.  After a few minutes, you can see that the stack creation is completed.

Clicking on the Resources tab, you can see that the S3 bucket has been created with the name “s3-stack-hellobucket-buhpx7oucrgn”.  AWS has generated this name since we didn’t specify the BucketName property in the YAML.

Note that deleting the stack will delete the S3 bucket which it had created.
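As an aside, if you prefer the AWS CLI to the console, a stack can be created and deleted from the same template with commands along these lines (the stack name here is just an example):

aws cloudformation create-stack --stack-name hellobucket-stack --template-body file://S3template.yml
aws cloudformation delete-stack --stack-name hellobucket-stack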

4.   Intrinsic functions

AWS CloudFormation provides several built-in functions that help you manage your stacks.

In the below example, we create two resources – a Security Group and an EC2 Instance, which uses this Security Group.  We can refer to the Security Group resource using the !Ref function.

Ec2template.yml

Resources:
  Ec2Instance:
    Type: 'AWS::EC2::Instance'
    Properties:
      SecurityGroups:
        - !Ref InstanceSecurityGroup
      KeyName: mykey
      ImageId: ''
  InstanceSecurityGroup:
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupDescription: Enable SSH access via port 22
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: '22'
          ToPort: '22'
          CidrIp: 0.0.0.0/0

Some other commonly used intrinsic functions are

  1. Fn::GetAtt – returns the value of an attribute from a resource in the template.
  2. Fn::Join – appends a set of values into a single value, separated by the specified delimiter. If a delimiter is an empty string, the set of values are concatenated with no delimiter.
  3. Fn::Sub – substitutes variables in an input string with values that you specify. In our templates, we can use this function to construct commands or outputs that include values that aren’t available until we create or update a stack.
5.   Parameters

Parameters enable us to input custom values to our template each time we create or update a stack.

TemplateWithParameters.yaml

Parameters: 
  InstanceTypeParameter: 
    Type: String
    Default: t2.micro
    AllowedValues: 
      - t2.micro
      - m1.small
      - m1.large
    Description: Enter t2.micro, m1.small, or m1.large. Default is t2.micro.
Resources:
  Ec2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType:
        Ref: InstanceTypeParameter
      ImageId: ami-0ff8a91507f77f867
6.   Pseudo Parameters

Pseudo parameters are parameters that are predefined by AWS CloudFormation. We do not declare them in our template. Use them the same way as we would a parameter as the argument for the Ref function.

Commonly used pseudo parameters:

  1. AWS::Region – Returns a string representing the AWS Region in which the encompassing resource is being created, such as us-west-2
  2. AWS::StackName – Returns the name of the stack as specified during cloudformation create-stack, such as teststack
7.   Mappings

The optional Mappings section matches a key to a corresponding set of named values. For example, if you want to set values based on a region, we can create a mapping that uses the region name as a key and contains the values we want to specify for each specific region. We use the Fn::FindInMap intrinsic function to retrieve values in a map.

We cannot include parameters, pseudo parameters, or intrinsic functions in the Mappings section.

TemplateWithMappings.yaml

AWSTemplateFormatVersion: "2010-09-09"
Mappings: 
  RegionMap: 
    us-east-1:
      HVM64: ami-0ff8a91507f77f867
      HVMG2: ami-0a584ac55a7631c0c
    us-west-1:
      HVM64: ami-0bdb828fd58c52235
      HVMG2: ami-066ee5fd4a9ef77f1
    eu-west-1:
      HVM64: ami-047bb4163c506cd98
      HVMG2: ami-0a7c483d527806435
    ap-northeast-1:
      HVM64: ami-06cd52961ce9f0d85
      HVMG2: ami-053cdd503598e4a9d
    ap-southeast-1:
      HVM64: ami-08569b978cc4dfa10
      HVMG2: ami-0be9df32ae9f92309
Resources: 
  myEC2Instance: 
    Type: "AWS::EC2::Instance"
    Properties: 
      ImageId: !FindInMap [RegionMap, !Ref "AWS::Region", HVM64]
      InstanceType: m1.small
8.   Outputs

The optional Outputs section declares output values that we can import into other stacks (to create cross-stack references), return in response (to describe stack calls), or view on the AWS CloudFormation console. For example, we can output the S3 bucket name for a stack to make the bucket easier to find.

In the below example, the output named StackVPC returns the ID of a VPC, and then exports the value for cross-stack referencing with the name VPCID appended to the stack’s name.

Outputs:
  StackVPC:
    Description: The ID of the VPC
    Value: !Ref MyVPC
    Export:
      Name: !Sub "${AWS::StackName}-VPCID"

As organizations start to create and maintain clusters in AKS (Azure Kubernetes Service), they also need to use cloud-based identity and access management service to access other Azure cloud resources and services. The Azure Active Directory (AAD) pod identity is a service that gives users this control by assigning identities to individual pods.  

Without these controls, accounts may get access to resources and services they don’t require. And it can also become hard for IT teams to track which set of credentials were used to make changes.

Azure AD Pod identity is just one small part of the container and Kubernetes management process and as you delve deeper, you will realize the true power that Kubernetes and Containers bring to your DevOps ecosystem.

Here is a more detailed look at how to use AAD pod identity for connecting pods in AKS cluster with Azure Key Vault.

Pod Identity

Integrate your key management system with Kubernetes using pod identity. Secrets, certificates, and keys in a key management system become a volume accessible to pods. The volume is mounted into the pod, and its data is available directly in the container file system for your application.

On an existing AKS cluster –

Deploy Key Vault FlexVolume to your AKS cluster with this command:

  • kubectl create -f https://raw.githubusercontent.com/Azure/kubernetes-keyvault-flexvol/master/deployment/kv-flexvol-installer.yaml
1. Create the Deployment

Run this command to create the aad-pod-identity deployment on an RBAC-enabled cluster:

  • kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/master/deploy/infra/deployment-rbac.yaml

Or run this command to deploy to a non-RBAC cluster:

  • kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/master/deploy/infra/deployment.yaml
2. Create an Azure Identity

Create an Azure managed identity.

Command: az identity create -g ResourceGroupNameOfAKsService -n aks-pod-identity (the name of the managed identity)

Output:

{
"clientId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx ",
"clientSecretUrl": "https://control-westus.identity.azure.net/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourcegroups/aks_dev_rg_wu/providers/Microsoft.ManagedIdentity/userAssignedIdentities/aks-pod-identity/credentials?tid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&oid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx&