    About this Episode

    Data breaches occur more often than we’d like them to. As businesses embrace remote work practices, IT resources are more at risk than ever before. Oracle Identity and Access Management (IAM) is an essential tool for protecting enterprise resources against cybersecurity threats. Join Lois Houston and Nikita Abraham, along with special guest Rohit Rahi, as they examine IAM and the key aspects of this service, and discuss how you can control who has access to your resources.

    Oracle MyLearn: https://mylearn.oracle.com/
    Oracle University Learning Community: https://education.oracle.com/ou-community

    Twitter: https://twitter.com/Oracle_Edu
    LinkedIn: https://www.linkedin.com/showcase/oracle-university/

    Special thanks to Arijit Ghosh, Kiran BR, Rashmi Panda, David Wright, the OU Podcast Team, and the OU Studio Team for helping us create this episode.

    Recent Episodes from Oracle University Podcast

    OCI AI Services

    Listen to Lois Houston and Nikita Abraham, along with Senior Principal Product Manager Wes Prichard, as they explore the five core components of OCI AI services: language, speech, vision, document understanding, and anomaly detection, to help you make better sense of all that unstructured data around you.

    Oracle MyLearn: https://mylearn.oracle.com/ou/learning-path/become-an-oci-ai-foundations-associate-2023/127177

    Oracle University Learning Community: https://education.oracle.com/ou-community

    LinkedIn: https://www.linkedin.com/showcase/oracle-university/

    X (formerly Twitter): https://twitter.com/Oracle_Edu

    Special thanks to Arijit Ghosh, David Wright, Himanshu Raj, and the OU Studio Team for helping us create this episode.

    --------------------------------------------------------

    Episode Transcript:

    00:00

    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

    00:26

    Nikita: Welcome to the Oracle University Podcast! I’m Nikita Abraham, Principal Technical Editor with Oracle University, and with me is Lois Houston, Director of Innovation Programs.

    Lois: Hi there! In our last episode, we spoke about OCI AI Portfolio, including AI and ML services, and the OCI AI infrastructure.

    Nikita: Yeah, and in today’s episode, we’re going to continue down a similar path and take a closer look at OCI AI services.

    00:55

    Lois: With us today is Senior Principal Product Manager, Wes Prichard. Hi Wes! It’s lovely to have you here with us. Hemant gave us a broad overview of the various OCI AI services last week, but we’re really hoping to get into each of them with you. So, let’s jump right in and start with the OCI Language service. What can you tell us about it?

    Wes: OCI Language analyzes unstructured text for you. It provides models trained on industry data to perform language analysis with no data science experience needed. 

    01:27

    Nikita: What kind of big things can it do?

    Wes: It has five main capabilities. First, it detects the language of the text. It recognizes 75 languages, from Afrikaans to Welsh. 
    It identifies entities, things like names, places, dates, emails, currency, organizations, phone numbers--14 types in all. It identifies the sentiment of the text, and not just one sentiment for the entire block of text, but the different sentiments for different aspects. 

    01:56

    Nikita: What do you mean by that, Wes?

    Wes: So let's say you read a restaurant review that said, the food was great, but the service sucked. You'll get food with a positive sentiment and service with a negative sentiment. And it also analyzes the sentiment for every sentence. 

    Lois: Ah, that’s smart. Ok, so we covered three capabilities. What else?

    Wes: It identifies key phrases in the text that represent the important ideas or subjects. And it classifies the general topic of the text from a list of 600 categories and subcategories. 

    02:27

    Lois: Ok, and then there’s the OCI Speech service... 

    Wes: OCI Speech is very straightforward. It unlocks the data in audio tracks by converting speech to text. Developers can use Oracle's time-tested acoustic and language models to provide highly accurate transcription for audio or video files across multiple languages.

    OCI Speech automatically transcribes audio and video files into text using advanced deep learning techniques. There's no data science experience required. It processes data directly in object storage. And it generates timestamped, grammatically accurate transcriptions. 

    03:01

    Nikita: What are some of the main features of OCI Speech?

    Wes: OCI Speech supports multiple languages, specifically English, Spanish, and Portuguese, with more coming in the future. It has batching support where multiple files can be submitted with a single call. It has blazing fast processing. It can transcribe hours of audio in less than 10 minutes. It does this by chunking up your audio into smaller segments, and transcribing each segment, and then joining them all back together into a single file. It provides a confidence score, both per word and per transcription. It punctuates transcriptions to make the text more readable and to allow downstream systems to process the text with less friction. 

    And it has SRT file support. 

    03:45

    Lois: SRT? What’s that?

    Wes: SRT is the most popular closed caption output file format. And with this SRT support, users can add closed captions to their video. OCI Speech makes transcribed text more readable to resemble how humans write. This is called normalization. And the service will normalize things like addresses, times, numbers, URLs, and more. 
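
    As a rough illustration of the SRT format Wes mentions, the sketch below builds caption text from timed segments. It does not call OCI Speech; the function names `srt_timestamp` and `to_srt` and the sample cues are made up for this example. It only shows the general SRT layout: a cue number, a `start --> end` time range, and the caption text.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS,mmm, the style SRT uses."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical transcript segments, not real service output.
    print(to_srt([(0.0, 2.0, "Welcome to the podcast."),
                  (2.0, 4.5, "Let's get started!")]))
```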

    It also does profanity filtering, where it can either remove, mask, or tag profanity in the output text. Removing replaces the word with asterisks. Masking does the same thing but retains the first letter. Tagging leaves the word in place but provides tagging in the output data.
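
    The three profanity filtering modes can be sketched in a few lines of Python. This is only a toy model of the behavior described above, not the OCI Speech implementation: the word list `BLOCKLIST`, the function `filter_profanity`, and the shape of the tag records are all assumptions made for the example.

```python
import re

# Hypothetical stand-in word list; the real service uses its own.
BLOCKLIST = {"darn", "heck"}


def filter_profanity(text, mode):
    """Apply 'remove', 'mask', or 'tag' filtering; return (text, tags)."""
    if mode not in ("remove", "mask", "tag"):
        raise ValueError(f"unknown mode: {mode}")
    tags = []

    def replace(match):
        word = match.group(0)
        if word.lower() not in BLOCKLIST:
            return word
        if mode == "remove":
            # Replace the whole word with asterisks.
            return "*" * len(word)
        if mode == "mask":
            # Keep the first letter, replace the rest with asterisks.
            return word[0] + "*" * (len(word) - 1)
        # mode == "tag": leave the word in place and record where it was.
        tags.append({"word": word, "offset": match.start()})
        return word

    return re.sub(r"[A-Za-z']+", replace, text), tags


if __name__ == "__main__":
    for mode in ("remove", "mask", "tag"):
        print(mode, filter_profanity("Well, darn it", mode))
```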

    04:29

    Nikita: And what about OCI Vision? What are its capabilities?

    Wes: Vision is a computer vision service that works on images, and it provides two main capabilities: image analysis and document AI. Image analysis analyzes photographic images. Object detection is the feature that detects objects inside an image using a bounding box and assigning a label to each object with an accuracy percentage. Object detection also locates and extracts text that appears in the scene, like on a sign.
    Image classification will assign classification labels to the image by identifying the major features in the scene. One of the most powerful capabilities of image analysis is that, in addition to pretrained models, users can retrain the models with their own unique data to fit their specific needs. 

    05:20

    Lois: So object detection and image classification are features of image analysis. I think I got it! So then what’s document AI?

    Wes: It's used for working with document images. You can use it to understand PDFs or document image types, like JPEG, PNG, and TIFF, or photographs containing textual information.

    05:40

    Lois: And what are its most important features?

    Wes: The features of document AI are text recognition, also known as OCR or optical character recognition. 
    And this extracts text from images, including non-trivial scenarios, like handwritten texts, plus tilted, shaded, or rotated documents. Document classification classifies documents into 10 different types based on visual appearance, high-level features, and extracted keywords. This is useful when you need to process a document, based on its classification, like an invoice, a receipt, or a resume. 

    Language detection analyzes the visual features of text to determine the language rather than relying on the text itself. Table extraction identifies tables in docs and extracts their content in tabular form. Key value extraction finds values for 13 common fields and line items in receipts, things like merchant name and transaction date. 

    06:41

    Want to get the inside scoop on Oracle University? Head over to the Oracle University Learning Community. Attend exclusive events. Read up on the latest news. Get first-hand access to new products. Read the OU Learning Blog. Participate in Challenges. And stay up-to-date with upcoming certification opportunities.

    Visit mylearn.oracle.com to get started. 

    07:06

    Nikita: Welcome back! Wes, I want to ask you about OCI Anomaly Detection. We discussed it a bit last week and it seems like such an intelligent and efficient service.

    Wes: Oracle Cloud Infrastructure Anomaly Detection identifies anomalies in time series data. Equipment sensors generate time series data, but all kinds of business metrics are also time-based. The unique feature of this service is that it finds anomalies, not just in a single signal, but across many signals at once. That's important because machines often generate multiple signals at once and the signals are often related. 

    07:42

    Nikita: Ok you need to give us an example of this!

    Wes: Think of a pump that has an output pressure, a flow rate, an RPM, and an electrical current draw. When a pump's going to fail, anomalies may appear across several of those signals but at different times. OCI Anomaly Detection helps you to identify anomalies in a multivariate data set by taking advantage of the interrelationship among signals. 

    The service contains algorithms for both multi-signal (multivariate) and single-signal (univariate) anomaly detection, and it automatically determines which algorithm to use based on the training data provided. The multivariate algorithm is called MSET-2, which stands for Multivariate State Estimation Technique, and it's unique to Oracle.

    08:28

    Lois: And the 2?

    Wes: The 2 in the name refers to the patented enhancements by Oracle Labs that automatically identify and fix data quality issues, resulting in fewer false alarms and more accurate results.
    Now unlike some of the other AI services, OCI Anomaly Detection is always trained on the customer's data. It's trained using actual historical data with no anomalies, and there can be as many different trained models as needed for different sets of signals. 

    08:57

    Nikita: So where would one use a service like this?

    Wes: One of the most obvious applications of this service is for predictive maintenance. Early warning of a problem provides the opportunity to deploy maintenance resources and schedule downtime to minimize disruption to the business. 

    09:12

    Lois: How would you train an OCI Anomaly Detection model?

    Wes: It's a simple four-step process to prepare a model that can be used for anomaly detection. The first step is to obtain training data from the system to be monitored. The data must contain no anomalies and should cover the normal range of values that would be experienced in a full business cycle. 
    Second, the training data file is uploaded to an object storage bucket. 

    Third, a data set is created for the training data. So a data set in this context is an object in the OCI Anomaly Detection service to manage data used for training and testing models. 

    And fourth, the model is trained. A wizard in the user interface steps the user through the required inputs, such as the training data set and some training parameters like the target false alarm probability. 

    10:02

    Lois: How would this service know about the data and whether the trained model is univariate or multivariate?

    Wes: When training OCI Anomaly Detection models, the user does not need to specify whether the intended model is for multivariate or univariate data. It does this detection automatically. 

    For example, if a model is trained with 10 signals and 5 of those signals are determined to be correlated enough for multivariate anomaly detection, it will create an internal multivariate model for those signals. If the other five signals are not correlated with each other, it will create an internal univariate model for each one. 

    From the user's perspective, the result will be a single OCI anomaly detection model for the 10 signals. But internally, the signals are treated differently based on the training. A user can also train a model on a single signal and it will result in a univariate model. 
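
    To make the idea of splitting signals by correlation concrete, here is a small Python sketch. It is not how OCI Anomaly Detection or MSET-2 actually decides; the source does not describe that. The Pearson correlation test, the 0.8 threshold, and the names `pearson` and `split_signals` are all assumptions chosen purely for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def split_signals(signals, threshold=0.8):
    """Return (multivariate_names, univariate_names).

    A signal joins the multivariate group if it is strongly correlated
    with at least one other signal. The threshold is an arbitrary
    illustrative value, not a documented service setting.
    """
    correlated = set()
    for a, b in combinations(signals, 2):
        if abs(pearson(signals[a], signals[b])) >= threshold:
            correlated.update((a, b))
    return sorted(correlated), sorted(set(signals) - correlated)


if __name__ == "__main__":
    # Made-up pump readings: pressure and flow move together, rpm does not.
    readings = {
        "pressure": [1, 2, 3, 4, 5],
        "flow": [2, 4, 6, 8, 10],
        "rpm": [3, 1, 4, 1, 5],
    }
    print(split_signals(readings))
```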

    10:55

    Lois: What does this OCI Anomaly Detection model training entail? How does it ensure that it does not have any false alarms?

    Wes: Training a model requires a single data file with no anomalies that should cover a complete business cycle, which means it should represent all the normal variations in the signal. During training, OCI Anomaly Detection will use a portion of the data for training and another portion for automated testing. The fraction used for each is specified when the model is trained. 
    When model training is complete, it's best practice to do another test of the model with a data set containing anomalies to see if the anomalies are detected and if there are any false alarms. Based on the outcome, the user may want to retrain the model and specify a different false alarm probability, also called F-A-P or FAP. The FAP is the probability that the model would produce a false alarm. The false alarm probability can be thought of as the sensitivity of the model. The lower the false alarm probability, the less likelihood of it reporting a false alarm, but the less sensitive it will be to detecting anomalies. Selecting the right FAP is a business decision based on the need for sensitive detections balanced by the ability to tolerate false alarms. 
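
    The FAP trade-off can be seen in a toy example. The sketch below is not the service's algorithm; it assumes, purely for illustration, that the gap between expected and observed values (the "residual") follows a normal distribution, and it flags a point when the residual exceeds a limit chosen so that a normal point is flagged with probability equal to the FAP. The names `threshold_for_fap` and `detect` are hypothetical.

```python
from statistics import NormalDist


def threshold_for_fap(fap, sigma=1.0):
    """Residual limit at which a normal point is flagged with probability `fap`.

    Assumes residuals are normally distributed with standard deviation
    `sigma` (an assumption of this sketch, not a statement about OCI).
    """
    if not 0 < fap < 1:
        raise ValueError("fap must be between 0 and 1")
    # Two-sided test: split the false alarm probability across both tails.
    return NormalDist().inv_cdf(1 - fap / 2) * sigma


def detect(residuals, fap, sigma=1.0):
    """Return the indexes of residuals that exceed the FAP-derived limit."""
    limit = threshold_for_fap(fap, sigma)
    return [i for i, r in enumerate(residuals) if abs(r) > limit]


if __name__ == "__main__":
    residuals = [0.1, -0.5, 2.5, 0.3, -3.6]  # made-up values
    for fap in (0.05, 0.001):
        print(fap, round(threshold_for_fap(fap), 2), detect(residuals, fap))
```

    In this toy setup, lowering the FAP from 0.05 to 0.001 raises the limit, so the milder deviation at index 2 is no longer reported: fewer false alarms, but less sensitivity.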

    Once a model has been trained and the user is satisfied with its detection performance, it can then be used for inferencing. 

    12:23

    Nikita: Inferencing? Is that what I think it is? 

    Wes: New data is submitted to the model and OCI Anomaly Detection will respond with anomalies that are detected. The input data must contain the same signals that the model was trained on. So, for example, if the model was trained on signals A, B, C, and D, then for detection inferencing, the same four signals must be provided. No more, no less.
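
    A client could guard against a signal mismatch before sending data for inferencing. The check below is a minimal sketch with a hypothetical helper name, `check_signals`; it compares signal names only and does not call the service.

```python
def check_signals(trained_signals, payload_signals):
    """Raise ValueError unless the payload has exactly the trained signals."""
    trained, provided = set(trained_signals), set(payload_signals)
    missing = sorted(trained - provided)
    extra = sorted(provided - trained)
    if missing or extra:
        raise ValueError(f"signal mismatch: missing={missing}, extra={extra}")


if __name__ == "__main__":
    check_signals(["A", "B", "C", "D"], ["A", "B", "C", "D"])  # passes
    try:
        check_signals(["A", "B", "C", "D"], ["A", "B", "C", "E"])
    except ValueError as err:
        print(err)
```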

    12:46

    Lois: Where can I find the features of OCI Anomaly Detection that you mentioned? 

    Wes: The training and inferencing features of OCI Anomaly Detection can be accessed through the OCI console. However, a human-driven interface is not efficient for most business scenarios. 

    In most cases, automating the detection of anomalies through software is preferred to be able to process hundreds or thousands of signals using many trained models. The service provides multiple software interfaces for this purpose. 
    Each trained model is accessible through a REST API and an HTTP endpoint. Additionally, programming language-specific SDKs are available for multiple languages, including Python. Using the Python SDK, data scientists can work with OCI Anomaly Detection for both training and inferencing in an OCI Data Science notebook. 

    13:37

    Nikita: How can a data scientist take advantage of these capabilities? 

    Wes: Well, you can write code against the REST API or use any of the various language SDKs. But for data scientists working in OCI Data Science, it makes sense to use Python. 

    13:51

    Lois: That’s exciting! What does it take to use the Python SDK in a notebook… to be able to use the AI services?

    Wes: You can use a Notebook session in OCI Data Science to invoke the SDK for any of the AI services. 

    This might be useful to generate new features for a custom model or simply as a way to consume the service using a familiar Python interface. But before you can invoke the SDK, you have to prepare the data science notebook session by supplying it with an API Signing Key. 

    The Signing Key is unique to a particular user and tenancy and authenticates that user to OCI when invoking the SDK. So you want to make sure you safeguard your Signing Key and never share it with another user.

    14:34

    Nikita: And where would I get my API Signing Key?

    Wes: You can obtain an API Signing Key from your user profile in the OCI Console. Then you save that key as a file to your local machine. 

    When you add the API Signing Key, the Console also provides the entries to be added to a config file that the SDK expects to find in the environment where the SDK code is executing. The config file then references the key file. Once these files are prepared on your local machine, you can upload them to the Notebook session, where you will execute SDK code for the AI service.
    The API Signing Key and config file can be reused with any of your notebook sessions, and the same files also work for all of the AI services. So, the files only need to be created once for each user and tenancy combination. 
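
    For readers who have not seen one, an OCI config file is a small INI-style file with a profile section (usually `[DEFAULT]`) that names the user, key fingerprint, tenancy, region, and key file path. The sketch below parses a sample with Python's standard `configparser` and checks that those entries are present. Every value shown is a placeholder, and `load_profile` is a hypothetical helper for this example; real code would normally load the file with the OCI Python SDK's `oci.config.from_file` instead.

```python
import configparser

REQUIRED_KEYS = ("user", "fingerprint", "tenancy", "region", "key_file")

# Placeholder values only; copy the real entries from the OCI Console.
SAMPLE_CONFIG = """\
[DEFAULT]
user=ocid1.user.oc1..exampleuser
fingerprint=aa:bb:cc:dd:ee:ff:00:11:22:33:44:55:66:77:88:99
tenancy=ocid1.tenancy.oc1..exampletenancy
region=us-ashburn-1
key_file=~/.oci/oci_api_key.pem
"""


def load_profile(config_text, profile="DEFAULT"):
    """Parse config text and return the profile's entries as a dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if profile != "DEFAULT" and not parser.has_section(profile):
        raise KeyError(f"profile not found: {profile}")
    values = dict(parser[profile])
    missing = [key for key in REQUIRED_KEYS if key not in values]
    if missing:
        raise ValueError(f"config is missing entries: {missing}")
    return values


if __name__ == "__main__":
    profile = load_profile(SAMPLE_CONFIG)
    print(profile["region"], profile["key_file"])
```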

    15:27

    Lois: Thank you so much, Wes, for this really insightful discussion. To learn more about the topics covered today, you can visit mylearn.oracle.com and search for the Oracle Cloud Infrastructure AI Foundations course.

    Nikita: And remember, that course prepares you for the Oracle Cloud Infrastructure AI Foundations Associate certification that you can take for free! So, don’t wait too long to check it out. Join us next week for another episode of the Oracle University Podcast. Until then, this is Nikita Abraham…

    Lois: And Lois Houston, signing off!

    16:03

    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

    The OCI AI Portfolio

    Oracle has been actively focusing on bringing AI to the enterprise at every layer of its tech stack, be it SaaS apps, AI services, infrastructure, or data.

    In this episode, hosts Lois Houston and Nikita Abraham, along with senior instructors Hemant Gahankari and Himanshu Raj, discuss OCI AI and Machine Learning services. They also go over some key OCI Data Science concepts and responsible AI principles.

    Oracle MyLearn: https://mylearn.oracle.com/ou/learning-path/become-an-oci-ai-foundations-associate-2023/127177

    Oracle University Learning Community: https://education.oracle.com/ou-community

    LinkedIn: https://www.linkedin.com/showcase/oracle-university/

    X (formerly Twitter): https://twitter.com/Oracle_Edu

    Special thanks to Arijit Ghosh, David Wright, Himanshu Raj, and the OU Studio Team for helping us create this episode.

    -------------------------------------------------------

    Episode Transcript:

    00:00

    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

    00:26

    Lois: Welcome to the Oracle University Podcast! I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Principal Technical Editor.

    Nikita: Hey everyone! In our last episode, we dove into Generative AI and Large Language Models.

    Lois: Yeah, that was an interesting one. But today, we’re going to discuss the AI and machine learning services offered by Oracle Cloud Infrastructure, and we’ll look at the OCI AI infrastructure.

    Nikita: I’m also going to try and squeeze in a couple of questions on a topic I’m really keen about, which is responsible AI. To take us through all of this, we have two of our colleagues, Hemant Gahankari and Himanshu Raj. Hemant is a Senior Principal OCI Instructor and Himanshu is a Senior Instructor on AI/ML. So, let’s get started!

    01:16

    Lois: Hi Hemant! We’re so excited to have you here! We know that Oracle has really been focusing on bringing AI to the enterprise at every layer of our stack. 

    Hemant: It all begins with data and infrastructure layers. OCI AI services consume data, and AI services, in turn, are consumed by applications. 

    This approach involves extensive investment from infrastructure to SaaS applications. Generative AI and massive scale models are the more recent steps. Oracle AI is the portfolio of cloud services for helping organizations use the data they may have for the business-specific uses. 

    Business applications consume AI and ML services. The foundation of AI services and ML services is data. AI services contain pre-built models for specific uses. Some of the AI services are pre-trained, and some can be additionally trained by the customer with their own data. 

    AI services can be consumed by calling the API for the service, passing in the data to be processed, and the service returns a result. There is no infrastructure to be managed for using AI services. 

    02:37

    Nikita: How do I access OCI AI services?

    Hemant: OCI AI services provide multiple methods for access. The most common method is the OCI Console. The OCI Console provides an easy to use, browser-based interface that enables access to notebook sessions and all the features of all the data science, as well as AI services. 

    The REST API provides access to service functionality but requires programming expertise. An API reference is provided in the product documentation. OCI also provides programming language SDKs for Java, Python, TypeScript, JavaScript, .NET, Go, and Ruby. The command line interface provides both quick access and full functionality without the need for scripting.

    03:31

    Lois: Hemant, what are the types of OCI AI services that are available? 

    Hemant: OCI AI services is a collection of services with pre-built machine learning models that make it easier for developers to build a variety of business applications. The models can also be custom trained for more accurate business results. The different services provided are digital assistant, language, vision, speech, document understanding, anomaly detection. 

    04:03

    Lois: I know we’re going to talk about them in more detail in the next episode, but can you introduce us to OCI Language, Vision, and Speech?

    Hemant: OCI Language allows you to perform sophisticated text analysis at scale. Using the pre-trained and custom models, you can process unstructured text to extract insights without data science expertise. Pre-trained models include language detection, sentiment analysis, key phrase extraction, text classification, named entity recognition, and personally identifiable information detection.

    Custom models can be trained for named entity recognition and text classification with domain-specific data sets. In text translation, neural machine translation is used to translate text across numerous languages.

    Using OCI Vision, you can upload images to detect and classify objects in them. Pre-trained models and custom models are supported. In image analysis, pre-trained models perform object detection, image classification, and optical character recognition. In image analysis, custom models can perform custom object detection by detecting the location of custom objects in an image and providing a bounding box. 
    The OCI Speech service is used to convert media files to readable text that's stored in JSON and SRT formats. Speech enables you to easily convert media files containing human speech into highly exact text transcriptions.

    05:52

    Nikita: That’s great. And what about document understanding and anomaly detection?

    Hemant: Using OCI Document Understanding, you can upload documents to detect and classify text and objects in them. You can process individual files or batches of documents. In OCR, Document Understanding can detect and recognize text in a document. In text extraction, Document Understanding provides the word-level and line-level text and the bounding box coordinates of where the text is found.

    In key value extraction, document understanding extracts a predefined list of key value pairs of information from receipts, invoices, passports, and driver IDs. In table extraction, document understanding extracts content in tabular format, maintaining the row and column relationship of cells. In document classification, the document understanding classifies documents into different types. 

    The OCI Anomaly Detection service analyzes large volumes of multivariate or univariate time series data. The Anomaly Detection service increases the reliability of businesses by monitoring their critical assets and detecting anomalies early with high precision. Anomaly detection is the identification of rare items, events, or observations in data that differ significantly from the expectation.

    07:34

    Nikita: Where is Anomaly Detection most useful?

    Hemant: The Anomaly Detection service is designed to help with analyzing large amounts of data and identifying the anomalies at the earliest possible time with maximum accuracy. Different sectors, such as utility, oil and gas, transportation, manufacturing, telecommunications, banking, and insurance use Anomaly Detection service for their day-to-day activities. 

    08:02

    Lois: Ok.. and the first OCI AI service you mentioned was digital assistant…

    Hemant: Oracle Digital Assistant is a platform that allows you to create and deploy digital assistants, which are AI driven interfaces that help users accomplish a variety of tasks with natural language conversations. When a user engages with the Digital Assistant, the Digital Assistant evaluates the user input and routes the conversation to and from the appropriate skills. 
    Digital Assistant greets the user upon access. Upon user request, it lists what it can do and provides entry points into the given skills. It routes explicit user requests to the appropriate skills. It also handles interruptions to flows, disambiguation, and requests to exit the bot.

    09:00

    Nikita: Excellent! Let’s bring Himanshu in to tell us about machine learning services. Hi Himanshu! Let’s talk about OCI Data Science. Can you tell us a bit about it?

    Himanshu: OCI Data Science is the cloud service focused on serving the data scientist throughout the full machine learning life cycle with support for Python and open source. 

    The service has many features, such as model catalog, projects, JupyterLab notebook, model deployment, model training, management, model explanation, open source libraries, and AutoML. 

    09:35

    Lois: Himanshu, what are the core principles of OCI Data Science?

    Himanshu: There are three core principles of OCI Data Science. The first one, accelerated. The first principle is about accelerating the work of the individual data scientist. OCI Data Science provides data scientists with open source libraries along with easy access to a range of compute power without having to manage any infrastructure. It also includes Oracle's own library to help streamline many aspects of their work. 
    The second principle is collaborative. It goes beyond an individual data scientist’s productivity to enable data science teams to work together. This is done by sharing assets, reducing duplicative work, and supporting the reproducibility and auditability of models for collaboration and risk management.

    Third is enterprise grade. That means it's integrated with all the OCI security and access protocols. The underlying infrastructure is fully managed. The customer does not have to think about provisioning compute and storage. And the service handles all the maintenance, patching, and upgrades so users can focus on solving business problems with data science.

    10:50

    Nikita: Let’s drill down into the specifics of OCI Data Science. So far, we know it’s a cloud service to rapidly build, train, deploy, and manage machine learning models. But who can use it? Where is it? And how is it used?

    Himanshu: It serves data scientists and data science teams throughout the full machine learning life cycle. 

    Users work in a familiar JupyterLab notebook interface, where they write Python code. As for how it is used, users preserve their models in the model catalog and deploy their models to a managed infrastructure.

    11:25

    Lois: Walk us through some of the key terminology that’s used.

    Himanshu: One important piece of OCI Data Science terminology is projects. Projects are containers that enable data science teams to organize their work. They represent collaborative workspaces for organizing and documenting data science assets, such as notebook sessions and models.

    Note that a tenancy can have as many projects as needed without limits. Next, the notebook session is where the data scientists work. Notebook sessions provide a JupyterLab environment with pre-installed open source libraries and the ability to add others. Notebook sessions are interactive coding environments for building and training models.

    Notebook sessions run in a managed infrastructure and the user can select CPU or GPU, the compute shape, and amount of storage without having to do any manual provisioning. Another important term is the Conda environment. Conda is an open source environment and package management system that was created for Python programs.

    12:33

    Nikita: What is a Conda environment used for?

    Himanshu: It is used in the service to quickly install, run, and update packages and their dependencies. Conda easily creates, saves, loads, and switches between environments in your notebook sessions.

    12:46

    Nikita: Earlier, you spoke about the support for Python in OCI Data Science. Is there a dedicated library?

    Himanshu: Oracle's Accelerated Data Science (ADS) SDK is a Python library that is included as part of OCI Data Science.
    ADS has many functions and objects that automate or simplify the steps in the data science workflow, including connecting to data, exploring and visualizing data, training a model with AutoML, evaluating models, and explaining models. In addition, ADS provides a simple interface to access the Data Science service model catalog and other OCI services, including object storage.

    13:24

    Lois: I also hear a lot about models. What are models?

    Himanshu: Models define a mathematical representation of your data and business process. You create models in notebook sessions inside projects.

    13:36

    Lois: What are some other important terminologies related to models?

    Himanshu: The next terminology is model catalog. The model catalog is a place to store, track, share, and manage models. 
    The model catalog is a centralized and managed repository of model artifacts. A stored model includes metadata about the provenance of the model, including Git-related information and the script or notebook used to push the model to the catalog. Models stored in the model catalog can be shared across members of a team, and they can be loaded back into a notebook session.

    The next one is model deployments. Model deployments allow you to deploy models stored in the model catalog as HTTP endpoints on managed infrastructure. 

    14:24

    Lois: So, how do you operationalize these models?

    Himanshu: Deploying machine learning models as web applications (HTTP API endpoints) that serve predictions in real time is the most common way to operationalize models. HTTP endpoints, or API endpoints, are flexible and can serve requests for model predictions. Data science jobs enable you to define and run repeatable machine learning tasks on fully managed infrastructure.

    Nikita: Thanks for that, Himanshu. 

    14:57

    Did you know that Oracle University offers free courses on Oracle Cloud Infrastructure? You’ll find training on everything from cloud computing, database, and security to artificial intelligence and machine learning, all free to subscribers. So, what are you waiting for? Pick a topic, leverage the Oracle University Learning Community to ask questions, and then sit for your certification.

    Visit mylearn.oracle.com to get started. 

    15:25

    Nikita: Welcome back! The Oracle AI Stack consists of AI services and machine learning services, and these services are built using AI infrastructure. So, let’s move on to that. Hemant, what are the components of OCI AI Infrastructure?

    Hemant: OCI AI Infrastructure is mainly composed of GPU-based instances, which can be virtual machines or bare metal machines, and high performance cluster networking that allows instances to communicate with each other. Superclusters are a massive network of GPU instances with multiple petabytes per second of bandwidth. A variety of fully managed storage options, from a single byte to exabytes without upfront provisioning, are also available. 

    16:14

    Lois: Can we explore each of these components a little more? First, tell us, why do we need GPUs?

    Hemant: ML and AI need lots of repetitive computations to be made on huge amounts of data. Parallel computing on GPUs is designed for running many processes at the same time. A GPU is a piece of hardware that is incredibly good at performing computations. 
    A GPU has thousands of lightweight cores, all working on their share of data in parallel. This gives it the ability to crunch through extremely large data sets at tremendous speed. 
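
    The pattern described here, many workers each applying the same operation to their own share of the data, can be sketched in plain Python. This is only an illustration of the structure: it uses a small thread pool from the standard library rather than a GPU, so it shows how the work is divided, not the speedup a GPU would deliver. The function names and the sample data are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def square_all(chunk):
    """The same simple operation applied to every element of one chunk."""
    return [x * x for x in chunk]


def split(data, workers):
    """Divide data into roughly equal, contiguous chunks, one per worker."""
    size = -(-len(data) // workers)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


data = list(range(16))
chunks = split(data, workers=4)

# Each worker processes its own share; map() returns results in chunk order.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(square_all, chunks))

# Stitch the per-worker results back together.
result = [y for part in partial_results for y in part]
print(result)
```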

    16:54

    Nikita: And what are the GPU instances offered by OCI?

    Hemant: GPU instances are ideally suited for model training and inference. Bare metal and virtual machine compute instances powered by NVIDIA GPUs H100, A100, A10, and V100 are made available by OCI. 

    17:14

    Nikita: So how do we choose what to train from these different GPU options? 

    Hemant: For large scale AI training, data analytics, and high performance computing, bare metal instances BM 8 X NVIDIA H100 and BM 8 X NVIDIA A100 can be used. 

    These provide up to nine times faster AI training and 30 times higher acceleration for AI inferencing. The other bare metal and virtual machines are used for small AI training, inference, streaming, gaming, and virtual desktop infrastructure. 

    17:53

    Lois: And why would someone choose the OCI AI stack over its counterparts?

    Hemant: Oracle offers all the features and is the most cost effective option when compared to its counterparts. 

    For example, BM GPU 4.8 version 2 instance costs just $4 per hour and is used by many customers. 

    Superclusters are a massive network with multiple petabytes per second of bandwidth. They can scale up to 4,096 OCI bare metal instances with 32,768 GPUs. 
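
    The two supercluster figures are consistent with each other if every bare metal instance carries eight GPUs, as in the "8 x" bare metal shapes mentioned earlier. A quick check in Python, with that per-instance count stated as an assumption:

```python
# Assumption: each bare metal instance has 8 GPUs, as in the "8 x" shapes
# mentioned earlier in the conversation.
GPUS_PER_BARE_METAL_INSTANCE = 8
MAX_INSTANCES = 4096  # supercluster scale quoted above

total_gpus = MAX_INSTANCES * GPUS_PER_BARE_METAL_INSTANCE
print(total_gpus)  # 32768
```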

    We also have a choice of bare metal A100 or H100 GPU instances, and we can select a variety of storage options, like object store, or block store, or even file system. For networking speeds, we can reach 1,600 Gb (gigabits) per second with A100 GPUs and 3,200 Gb per second with H100 GPUs. 

    With OCI storage, we can select local SSD with up to four NVMe drives, block storage up to 32 terabytes per volume, object storage up to 10 terabytes per object, and file systems up to eight exabytes per file system. OCI File Storage employs five-way replicated storage located in different fault domains to provide redundancy for resilient data protection. 

    HPC file systems, such as BeeGFS and many others are also offered. OCI HPC file systems are available on Oracle Cloud Marketplace and make it easy to deploy a variety of high performance file servers. 

    19:50

    Lois: I think a discussion on AI would be incomplete if we don’t talk about responsible AI. We’re using AI more and more every day, but can we actually trust it?

    Hemant: For us to trust AI, it must be driven by ethics that guide us as well.

    Nikita: And do we have some principles that guide the use of AI?

    Hemant: AI should be lawful, complying with all applicable laws and regulations. AI should be ethical, that is, it should ensure adherence to ethical principles and values that we uphold as humans. And AI should be robust, both from a technical and social perspective, because even with good intentions, AI systems can cause unintentional harm.

    AI systems do not operate in a lawless world. A number of legally binding rules at the national and international level apply or are relevant to the development, deployment, and use of AI systems today. The law not only prohibits certain actions but also enables others, like protecting the rights of minorities or protecting the environment. Besides horizontally applicable rules, various domain-specific rules exist that apply to particular AI applications. For instance, the medical device regulation in the health care sector. 

    In the AI context, equality entails that the systems’ operations cannot generate unfairly biased outputs. And while we adopt AI, citizens’ rights should also be protected. 

    21:30

    Lois: Ok, but how do we derive AI ethics from these?

    Hemant: There are three main principles. 
    AI should be used to help humans and allow for oversight. It should never cause physical or social harm. Decisions taken by AI should be transparent and fair, and also should be explainable. AI that follows the AI ethical principles is responsible AI. 

    So if we map the AI ethical principles to responsible AI requirements, these will be like, AI systems should follow human-centric design principles and leave meaningful opportunity for human choice. This means securing human oversight. AI systems and environments in which they operate must be safe and secure, they must be technically robust, and should not be open to malicious use. 

    The development, and deployment, and use of AI systems must be fair, ensuring equal and just distribution of both benefits and costs. AI should be free from unfair bias and discrimination. Decisions taken by AI to the extent possible should be explainable to those directly and indirectly affected. 

    23:01

    Nikita: This is all great, but what does a typical responsible AI implementation process look like? 

    Hemant: First, governance needs to be put in place. Second, develop a set of policies and procedures to be followed. And once implemented, ensure compliance by regular monitoring and evaluation. 

    Lois: And this is all managed by developers?

    Hemant: Typical roles that are involved in the implementation cycles are developers, deployers, and end users of the AI. 

    23:35

    Nikita: Can we talk about AI specifically in health care? How do we ensure that there is fairness and no bias?

    Hemant: AI systems are only as good as the data that they are trained on. If that data is predominantly from one gender or racial group, the AI systems might not perform as well on data from other groups. 
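
    One simple way to check for the problem described here is to measure a model's accuracy separately for each group and compare the results. The sketch below does that in Python. The group names, labels, and predictions are made-up illustration data, and the function name is hypothetical; a real fairness review would use real evaluation data and usually more than one metric.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        correct[group] += int(truth == predicted)
    return {group: correct[group] / total[group] for group in total}


# Made-up evaluation records, for illustration only.
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 0, 0),
]

scores = accuracy_by_group(records)
# A large gap between the best and worst group is a signal to investigate.
gap = max(scores.values()) - min(scores.values())
print(scores, gap)
```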

    24:00

    Lois: Yeah, and there’s also the issue of ensuring transparency, right?

    Hemant: AI systems often make decisions based on complex algorithms that are difficult for humans to understand. As a result, patients and health care providers can have difficulty trusting the decisions made by the AI. AI systems must be regularly evaluated to ensure that they are performing as intended and not causing harm to patients. 

    24:29

    Nikita: Thank you, Hemant and Himanshu, for this really insightful session. If you’re interested in learning more about the topics we discussed today, head on over to mylearn.oracle.com and search for the Oracle Cloud Infrastructure AI Foundations course. 

    Lois: That’s right, Niki. You’ll find demos that you can watch as well as skill checks that you can attempt to better your understanding. In our next episode, we’ll get into the OCI AI Services we discussed today and talk about them in more detail. Until then, this is Lois Houston…

    Nikita: And Nikita Abraham, signing off!

    25:05

    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

    Generative AI and Large Language Models

    In this week’s episode, Lois Houston and Nikita Abraham, along with Senior Instructor Himanshu Raj, take you through the extraordinary capabilities of Generative AI, a subset of deep learning that doesn’t make predictions but rather creates its own content.

    They also explore the workings of Large Language Models.

    Oracle MyLearn: https://mylearn.oracle.com/ou/learning-path/become-an-oci-ai-foundations-associate-2023/127177

    Oracle University Learning Community: https://education.oracle.com/ou-community

    LinkedIn: https://www.linkedin.com/showcase/oracle-university/

    X (formerly Twitter): https://twitter.com/Oracle_Edu

    Special thanks to Arijit Ghosh, David Wright, and the OU Studio Team for helping us create this episode.

    --------------------------------------------------------

    Episode Transcript:

    00:00

    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

    00:26

    Lois: Hello and welcome to the Oracle University Podcast. I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Principal Technical Editor. 

    Nikita: Hi everyone! In our last episode, we went over the basics of deep learning. Today, we’ll look at generative AI and large language models, and discuss how they work. To help us with that, we have Himanshu Raj, Senior Instructor on AI/ML. So, let’s jump right in. Hi Himanshu, what is generative AI? 

    01:00

    Himanshu: Generative AI refers to a type of AI that can create new content. It is a subset of deep learning, where the models are trained not to make predictions but rather to generate output on their own. 

    Think of generative AI as an artist who looks at a lot of paintings and learns the patterns and styles present in them. Once it has learned these patterns, it can generate new paintings that resemble what it learned.

    01:27

    Lois: Let's take an example to understand this better. Suppose we want to train a generative AI model to draw a dog. How would we achieve this?

    Himanshu: You would start by giving it a lot of pictures of dogs to learn from. The AI does not know anything about what a dog looks like. But by looking at these pictures, it starts to figure out common patterns and features, like dogs often have pointy ears, narrow faces, whiskers, etc. You can then ask it to draw a new picture of a dog. 

    The AI will use the patterns it learned to generate a picture that hopefully looks like a dog. But remember, the AI is not copying any of the pictures it has seen before but creating a new image based on the patterns it has learned. This is the basic idea behind generative AI. In practice, the process involves a lot of complex maths and computation, and there are different techniques and architectures that can be used, such as variational autoencoders (VAEs) and Generative Adversarial Networks (GANs). 

    02:27

    Nikita: Himanshu, where is generative AI used in the real world?

    Himanshu: Generative AI models have a wide variety of applications across numerous domains. For image generation, generative models like GANs are used to generate realistic images. They can be used for tasks like creating artwork, synthesizing images of human faces, or transforming sketches into photorealistic images. 

    For text generation, large language models like GPT-3, which are generative in nature, can create human-like text. This has applications in content creation, like writing articles and generating ideas, and again, in conversational AI, like chatbots and customer service agents. They are also used in programming for code generation and debugging, and much more. 

    For music generation, generative AI models can also be used. They create new pieces of music after being trained on a specific style or collection of tunes. A famous example is OpenAI's MuseNet.

    03:21

    Lois: You mentioned large language models in the context of text-based generative AI. So, let’s talk a little more about it. Himanshu, what exactly are large language models?

    Himanshu: LLMs are a type of artificial intelligence model built to understand, generate, and process human language at a massive scale. They were primarily designed for sequence to sequence tasks such as machine translation, where an input sequence is transformed into an output sequence. 

    LLMs can be used to translate text from one language to another. For example, an LLM could be used to translate English text into French. To do this job, an LLM is trained on a massive data set of text and code, which allows it to learn the patterns and relationships that exist between different languages. The LLM translates “How are you?” from English to French as “Comment allez-vous?” 

    It can also answer questions like, “What is the capital of France?” And it would answer, “The capital of France is Paris.” It can also write an essay on a given topic. For example, ask it to write an essay on the French Revolution, and it will come up with a response with a title and an introduction.

    04:33

    Lois: And how do LLMs actually work?

    Himanshu: So, LLMs are typically based on deep learning architectures such as transformers. They are also trained on vast amounts of text data to learn language patterns and relationships, again, with a massive number of parameters, usually on the order of millions or even billions. LLMs also have the ability to comprehend and understand natural language text at a semantic level. They can grasp context, infer meaning, and identify relationships between words and phrases. 

    05:05

    Nikita: What are the most important factors for a large language model?

    Himanshu: Model size and parameters are crucial aspects of large language models and other deep learning models. They significantly impact the model’s capabilities, performance, and resource requirements. So, what is model size? The model size refers to the amount of memory required to store the model's parameters and other data structures. Larger model sizes generally lead to better performance as they can capture more complex patterns and representations from the data. 

    The parameters are the numerical values of the model that change as it learns to minimize the model's error on the given task. In the context of LLMs, parameters refer to the weights and biases of the model's transformer layers. Parameters are usually measured in terms of millions or billions. For example, GPT-3, one of the largest LLMs to date, has 175 billion parameters making it extremely powerful in language understanding and generation. 
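
    The link between parameter count and model size can be shown with a rough calculation. The Python sketch below estimates the memory needed for the weights alone. The bytes-per-parameter values are assumptions (2 bytes for 16-bit weights, 4 bytes for 32-bit weights) and the estimate ignores the other data structures a running model needs, so treat it as a lower bound rather than an exact figure.

```python
def model_size_gb(num_parameters, bytes_per_parameter=2):
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


gpt3_parameters = 175_000_000_000  # 175 billion, as quoted above

# Assumed storage precisions: 2 bytes (16-bit) and 4 bytes (32-bit) per weight.
size_16_bit = model_size_gb(gpt3_parameters)
size_32_bit = model_size_gb(gpt3_parameters, bytes_per_parameter=4)
print(size_16_bit, size_32_bit)
```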

    Tokens represent the individual units into which a piece of text is divided during processing by the model. In natural language, tokens are usually words, subwords, or characters. Some models have a maximum token limit that they can process, and longer text may require truncation or splitting. Again, balancing model size, parameters, and token handling is crucial when working with LLMs. 
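
    Here is a small Python sketch of the token handling just described: dividing text into tokens, then either truncating or splitting when the text exceeds a token limit. The whitespace tokenizer and the limit of four tokens are simplifying assumptions for illustration; production models typically use subword tokenizers and far larger limits.

```python
def tokenize(text):
    """Toy tokenizer: whitespace-separated words (real LLMs usually use subwords)."""
    return text.split()


def truncate(tokens, max_tokens):
    """Keep only the first max_tokens tokens and drop the rest."""
    return tokens[:max_tokens]


def split_into_windows(tokens, max_tokens):
    """Break a long token list into consecutive pieces that each fit the limit."""
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


tokens = tokenize("What is the capital of France")
truncated = truncate(tokens, 4)
windows = split_into_windows(tokens, 4)
print(len(tokens), truncated, windows)
```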

    06:29

    Nikita: But what’s so great about LLMs?

    Himanshu: Large language models can understand and interpret human language more accurately and contextually. They can comprehend complex sentence structures, nuances, and word meanings, enabling them to provide more accurate and relevant responses to user queries. These models can generate human-like text that is coherent and contextually appropriate. This capability is valuable for content creation, automated writing, and generating personalized responses in applications like chatbots and virtual assistants. They can perform a variety of tasks. 

    Large language models are very versatile and adaptable to various industries. They can be customized to excel in applications such as language translation, sentiment analysis, code generation, and much more. LLMs can handle multiple languages making them valuable for cross-lingual tasks like translation, sentiment analysis, and understanding diverse global content. 

    Large language models can, again, be fine-tuned for a specific task using a minimal amount of domain data. The efficiency of LLMs usually grows with more data and parameters.

    07:34

    Lois: You mentioned the “sequence to sequence tasks” earlier. Can you explain the concept in simple terms for us?

    Himanshu: Understanding language is difficult for computers and AI systems. The reason being that words often have meanings based on context. Consider a sentence such as Jane threw the frisbee, and her dog fetched it. 

    In this sentence, there are a few things that relate to each other. Jane is doing the throwing. The dog is doing the fetching. And it refers to the frisbee. Suppose we are looking at the word “it” in the sentence. As a human, we understand easily that “it” refers to the frisbee. But for a machine, it can be tricky.

    The goal in sequence problems is to find patterns, dependencies, or relationships within the data and make predictions, classification, or generate new sequences based on that understanding.

    08:27

    Lois: And where are sequence models mostly used?

    Himanshu: Some common examples of where sequence models are used include natural language processing, which we call NLP. Tasks such as machine translation, text generation, sentiment analysis, and language modeling involve dealing with sequences of words or characters. 

    Speech recognition. Converting audio signals into text involves working with sequences of phonemes or subword units to recognize spoken words. Music generation. Generating new music involves modeling musical sequences, notes, and rhythms to create original compositions. 

    Gesture recognition. Sequences of motion or hand gestures are used to interpret human movements for applications, such as sign language recognition or gesture-based interfaces. Time series analysis. In fields such as finance, economics, weather forecasting, and signal processing, time series data is used to predict future values, detect anomalies, and understand patterns in temporal data.

    09:35

    The Oracle University Learning Community is an excellent place to collaborate and learn with Oracle experts and fellow learners. Grow your skills, inspire innovation, and celebrate your successes. All your activities, from liking a post to answering questions and sharing with others, will help you earn a valuable reputation, badges, and ranks to be recognized in the community.

    Visit mylearn.oracle.com to get started. 

    10:03

    Nikita: Welcome back! Himanshu, what would be the best way to solve those sequence problems you mentioned? Let’s use the same sentence, “Jane threw the frisbee, and her dog fetched it” as an example.

    Himanshu: The solution is transformers. It's like the model has a bird's eye view of the entire sentence and can see how all the words relate to each other. This allows it to understand the sentence as a whole instead of just a series of individual words. Transformers, with their self-attention mechanism, can look at all the words in the sentence at the same time and understand how they relate to each other. 

    For example, a transformer can simultaneously understand the connections between Jane and dog even though they are far apart in the sentence.

    10:52

    Nikita: But how?

    Himanshu: The answer is attention, which adds context to the text. Attention would notice dog comes after frisbee, fetched comes after dog, and it comes after fetched. 

    The transformer does not look at “it” in isolation. Instead, it also pays attention to all the other words in the sentence at the same time. By considering all these connections, the model can figure out that “it” likely refers to the frisbee. 
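
    The attention idea can be sketched numerically as scaled dot-product self-attention: each word's vector is compared with every other word's vector, the similarity scores are turned into weights that sum to one, and the weights say how much attention each word pays to the others. The Python below is a simplified illustration, not a full transformer. The two-dimensional word vectors are made up, with “it” deliberately placed close to “frisbee”, and the learned query, key, and value projections of a real model are omitted.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = vectors."""
    d = len(vectors[0])
    weights = [
        softmax([dot(q, k) / math.sqrt(d) for k in vectors]) for q in vectors
    ]
    outputs = [
        [sum(w * v[j] for w, v in zip(row, vectors)) for j in range(d)]
        for row in weights
    ]
    return weights, outputs


# Made-up 2-dimensional embeddings (assumption, for illustration only).
words = ["Jane", "frisbee", "dog", "it"]
vectors = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [0.1, 1.8]]

weights, outputs = self_attention(vectors)
# How much attention "it" pays to each word in the sentence.
it_weights = dict(zip(words, weights[words.index("it")]))
print(it_weights)
```

    With these particular vectors, the largest weight for “it” falls on “frisbee”, which mirrors the behavior described above. In a trained model, that outcome comes from learned parameters rather than hand-picked numbers.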

    The most famous current models that are emerging in natural language processing tasks consist of dozens of transformers or some of their variants, for example, GPT or BERT.

    11:32

    Lois: I was looking at the AI Foundations course on MyLearn and came across the terms “prompt engineering” and “fine tuning.” Can you shed some light on them?

    Himanshu: A prompt is the input or initial text provided to the model to elicit a specific response or behavior. So, this is something which you write or ask a language model. Now, what is prompt engineering? So prompt engineering is the process of designing and formulating specific instructions or queries to interact with a large language model effectively. 
    In the context of large language models, such as GPT-3 or BERT, prompts are the input text or questions given to the model to generate responses or perform specific tasks. 

    The goal of prompt engineering is to ensure that the language model understands the user's intent correctly and provides accurate and relevant responses.
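
    As a small illustration of prompt engineering, the Python sketch below assembles a prompt from three parts: an instruction, some supporting context, and the user's question. The template layout and the function name are assumptions chosen for this example, not a format prescribed by any particular model; the point is only that the wording and structure of the input are designed deliberately.

```python
def build_prompt(instruction, context, question):
    """Assemble a prompt from an instruction, supporting context, and a question."""
    return (
        f"{instruction}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_prompt(
    instruction="Answer in one sentence using only the context provided.",
    context="Paris is the capital of France.",
    question="What is the capital of France?",
)
print(prompt)
```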

    12:26

    Nikita: That sounds easy enough, but fine tuning seems a bit more complex. Can you explain it with an example?

    Himanshu: Imagine you have a versatile recipe robot named chef bot. Suppose that chef bot is designed to create delicious recipes for any dish you desire. 

    Chef bot recognizes the prompt as a request for a pizza recipe, and it knows exactly what to do.

    However, if you want chef bot to be an expert in a particular type of cuisine, such as Italian dishes, you fine-tune chef bot for Italian cuisine by immersing it in a culinary crash course filled with Italian cookbooks, traditional Italian recipes, and even Italian cooking shows. 

    During this process, chef bot becomes more specialized in creating authentic Italian recipes, and this process is called fine tuning. LLMs are general purpose models that are pre-trained on large data sets but are often fine-tuned to address specific use cases. 

    When you combine prompt engineering and fine tuning, you get a culinary wizard in chef bot, a recipe robot that is not only great at understanding specific dish requests but also capable of following a specific dish request and even mastering the art of cooking in a particular culinary style.

    13:47

    Lois: Great! Now that we’ve spoken about all the major components, can you walk us through the life cycle of a large language model?

    Himanshu: The life cycle of a Large Language Model, LLM, involves several stages, from its initial pre-training to its deployment and ongoing refinement. 

    The first phase of this life cycle is pre-training. The LLM is initially pre-trained on a large corpus of text data from the internet. During pre-training, the model learns grammar, facts, reasoning abilities, and general language understanding. The model predicts the next word in a sentence given the previous words, which helps it capture relationships between words and the structure of language. 
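
    Next-word prediction can be illustrated with a far simpler technique than an LLM uses: a bigram model that counts which word most often follows each word in the training text. The Python sketch below is that stand-in, with a three-sentence made-up corpus. An actual LLM learns from a vastly larger corpus and conditions on all the previous words, not just the last one.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the training text."""
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            followers[current][following] += 1
    return followers


def predict_next(model, word):
    """Return the most frequent follower of word, or None if it was never seen."""
    counts = model.get(word.lower())
    return counts.most_common(1)[0][0] if counts else None


# Made-up training sentences, for illustration only.
corpus = [
    "the dog fetched the frisbee",
    "the dog chased the ball",
    "the dog fetched the stick",
]
model = train_bigram_model(corpus)
prediction = predict_next(model, "dog")
print(prediction)
```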

    The second phase is fine tuning initialization. After pre-training, the model's weights are initialized, and it's ready for task-specific fine tuning. Fine tuning can involve supervised learning on labeled data for specific tasks, such as sentiment analysis, translation, or text generation. 

    The model is fine-tuned on specific tasks using a smaller domain-specific data set. The weights from pre-training are updated based on the new data, making the model task aware and specialized. The next phase of the LLM life cycle is prompt engineering. In this phase, you craft effective prompts to guide the model's behavior in generating specific responses. 

    Different prompt formulations, instructions, or context can be used to shape the output. 

    15:13

    Nikita: Ok… we’re with you so far. What’s next?

    Himanshu: The next phase is evaluation and iteration. So models are evaluated using various metrics to assess their performance on specific tasks. Iterative refinement involves adjusting model parameters, prompts, and fine tuning strategies to improve results. 
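
    The episode does not name specific metrics, so as an example, the Python sketch below computes three common ones for a binary classification task such as sentiment analysis: accuracy, precision, and recall. The labels and predictions are made-up illustration data, and the function name is hypothetical.

```python
def classification_metrics(truth, predicted, positive=1):
    """Compute accuracy, precision, and recall for binary labels."""
    pairs = list(zip(truth, predicted))
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    false_pos = sum(t != positive and p == positive for t, p in pairs)
    false_neg = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0,
        "recall": true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0,
    }


# Made-up labels (1 = positive sentiment, 0 = negative), for illustration only.
truth = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(truth, predicted)
print(metrics)
```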

    So as a part of this step, you also do few-shot and one-shot inference. If needed, you further fine tune the model with a small number of examples (few shot) or a single example (one shot) for new tasks or scenarios. 
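
    One common way to do few-shot or one-shot inference is to place the worked examples directly in the prompt, ahead of the new input. The Python sketch below builds such prompts for a sentiment task. The task wording, the example reviews, and the function name are assumptions made for illustration.

```python
def few_shot_prompt(examples, query):
    """Build a prompt that shows worked examples before the new input."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)


# Made-up examples, for illustration only.
examples = [
    ("The recipe was delicious.", "positive"),
    ("The instructions were confusing.", "negative"),
]

one_shot = few_shot_prompt(examples[:1], "I would cook this again.")
few_shot = few_shot_prompt(examples, "I would cook this again.")
print(few_shot)
```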

    Also, you do bias mitigation and consider ethical concerns. Biases and ethical concerns may arise in the model's output. You need to implement measures to ensure fairness, inclusivity, and responsible use. 

    16:07

    Himanshu: The next phase in LLM life cycle is deployment. Once the model has been fine-tuned and evaluated, it is deployed for real world applications. Deployed models can perform tasks, such as text generation, translation, summarization, and much more. You also perform monitoring and maintenance in this phase. 

    So you continuously monitor the model's performance and output to ensure it aligns with desired outcomes. You also periodically update and retrain the model to incorporate new data and to adapt to evolving language patterns. This overall life cycle can also consist of a feedback loop, where you gather feedback from users and incorporate it into the model’s improvement process. 

    You use this feedback to further refine prompts, fine tuning, and overall model behavior. RLHF, which is Reinforcement Learning from Human Feedback, is a very good example of this feedback loop. You also research and innovate as a part of this life cycle, where you continue to research and develop new techniques to enhance the model's capabilities and address different challenges associated with it.

    17:19

    Nikita: As we’re talking about the LLM life cycle, I see that fine tuning is not only about making an LLM task specific. So, what are some other reasons you would fine tune an LLM?

    Himanshu: The first one is task-specific adaptation. Pre-trained language models are trained on extensive and diverse data sets and have good general language understanding. They excel in language generation and comprehension tasks, though the broad understanding of language may not lead to optimal performance in specific tasks. 

    These models are not task specific. So the solution is fine tuning. The fine tuning process customizes the pre-trained models for a specific task by further training on task-specific data to adapt the model's knowledge. 
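
    The mechanics of fine tuning, continuing training from pre-trained weights on a small task-specific data set, can be shown with a deliberately tiny model. The Python sketch below uses a one-parameter model `y = weight * x` trained by gradient descent. It is an analogy, not an LLM: the data sets, learning rate, and step counts are made up. It compares starting from the pre-trained weight with starting from scratch, given the same small number of training steps on the task data.

```python
def train(weight, data, steps, learning_rate=0.05):
    """Gradient descent on mean squared error for the model y = weight * x."""
    for _ in range(steps):
        gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= learning_rate * gradient
    return weight


# Made-up data: the "general" relationship is y = 2x, the task needs y = 2.5x.
general_data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
task_data = [(1.0, 2.5), (2.0, 5.0)]

pretrained = train(0.0, general_data, steps=200)      # "pre-training"
fine_tuned = train(pretrained, task_data, steps=5)     # start from pre-trained weight
from_scratch = train(0.0, task_data, steps=5)          # same budget, no pre-training

print(pretrained, fine_tuned, from_scratch)
```

    With the same five steps, the fine-tuned weight ends up closer to the task's target of 2.5 than the weight trained from scratch, because it starts from knowledge that is already nearly right.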

    The second reason is domain-specific vocabulary. Pre-trained models might lack knowledge of specific words and phrases essential for certain tasks in fields, such as legal, medical, finance, and technical domains. This can limit their performance when applied to domain-specific data. 

    Fine tuning enables the model to adapt and learn domain-specific words and phrases. These words could be, again, from different domains. 

    18:35

    Himanshu: The third reason to fine tune is efficiency and resource utilization. So fine tuning is computationally efficient compared to training from scratch. 

    Fine tuning reuses the knowledge from pre-trained models, saving time and resources. Fine tuning requires fewer iterations to achieve task-specific competence. Shorter training cycles expedite the model development process. It conserves computational resources, such as GPU memory and processing power. 

    Fine tuning enables quicker model deployment, with faster time to production for real world applications. Fine tuning is, again, scalable, enabling adaptation to various tasks with the same base model, which further reduces resource demands and leads to cost savings for research and development. 

    The fourth reason to fine tune is ethical concerns. Pre-trained models learn from diverse data and can potentially inherit different biases. Fine tuning might not completely eliminate biases, but careful curation of task-specific data helps avoid biased or harmful vocabulary. The responsible use of domain-specific terms promotes ethical AI applications. 

    19:53

    Lois: Thank you so much, Himanshu, for spending time with us. We had such a great time learning from you. If you want to learn more about the topics discussed today, head over to mylearn.oracle.com and get started on our free AI Foundations course.

    Nikita: Yeah, we even have a detailed walkthrough of the architecture of transformers that you might want to check out. Join us next week for a discussion on the OCI AI Portfolio. Until then, this is Nikita Abraham…

    Lois: And Lois Houston signing off!

    20:24

    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

    Deep Learning

    Did you know that the concept of deep learning goes way back to the 1950s? However, it is only in recent years that this technology has created a tremendous amount of buzz (and for good reason!). A subset of machine learning, deep learning is inspired by the structure of the human brain, making it fascinating to learn about.

    In this episode, Lois Houston and Nikita Abraham interview Senior Principal OCI Instructor Hemant Gahankari about deep learning concepts, including how Convolution Neural Networks work, and help you get your deep learning basics right.

    Oracle MyLearn: https://mylearn.oracle.com/

    Oracle University Learning Community: https://education.oracle.com/ou-community

    LinkedIn: https://www.linkedin.com/showcase/oracle-university/

    X (formerly Twitter): https://twitter.com/Oracle_Edu

    Special thanks to Arijit Ghosh, David Wright, Himanshu Raj, and the OU Studio Team for helping us create this episode.

    --------------------------------------------------------

    Episode Transcript:

    00:00

    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

    00:26

    Lois: Hello and welcome to the Oracle University Podcast. I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Principal Technical Editor.

    Nikita: Hi everyone! Last week, we covered the new MySQL HeatWave Implementation Associate certification. So do go check out that episode if it interests you.

    Lois: That was a really interesting discussion for sure. Today, we’re going to focus on the basics of deep learning with our Senior Principal OCI Instructor, Hemant Gahankari.

    00:58

    Nikita: Hi Hemant! Thanks for being with us today. So, to get started, what is deep learning?

    Hemant: Deep learning is a subset of machine learning that focuses on training Artificial Neural Networks to solve a task at hand. Say, for example, image classification. A very important quality of the ANN is that it can process raw data like pixels of an image and extract patterns from it. These patterns are treated as features to predict the outcomes. 

    Let us say we have a set of handwritten images of digits 0 to 9. As we know, everyone writes the digits in a slightly different way. So how do we train a machine to identify a handwritten digit? For this, we use an ANN. 

    An ANN accepts image pixels as inputs, extracts patterns like edges and curves and so on, and correlates these patterns to predict an outcome. That is, which digit the image contains in this case. 
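
    The pixels-in, prediction-out flow can be sketched with a single layer of a neural network in Python. This is a heavily simplified illustration: the images are 3x3 instead of real handwritten digits, there are only two classes, and the weights are hand-picked for the example rather than learned through training, which is what an actual ANN would do.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(pixels, weights, biases):
    """One dense layer followed by softmax: pixels in, class probabilities out."""
    scores = [
        sum(w * p for w, p in zip(row, pixels)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# 3x3 images flattened row by row; 1.0 = ink, 0.0 = background.
zero_image = [1, 1, 1, 1, 0, 1, 1, 1, 1]   # a ring, like "0"
one_image = [0, 1, 0, 0, 1, 0, 0, 1, 0]    # a vertical stroke, like "1"

# Hand-picked weights (assumption, not learned): row 0 responds to a ring,
# row 1 responds to a vertical stroke in the middle column.
weights = [
    [1, 1, 1, 1, -1, 1, 1, 1, 1],
    [-1, 1, -1, -1, 1, -1, -1, 1, -1],
]
biases = [0.0, 0.0]

for image in (zero_image, one_image):
    probabilities = predict(image, weights, biases)
    print(probabilities.index(max(probabilities)), probabilities)
```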

    02:04

    Lois: Ok, so what you’re saying is given a bunch of pixels, ANN is able to process pixel data, learn an internal representation of the data, and predict outcomes. That’s so cool! So, why do we need deep learning?

    Hemant: We need to specify features when we train a machine learning algorithm. With deep learning, features are automatically extracted from the data. Deep learning algorithms build an internal representation of the features and their combinations to predict outcomes, which may not be feasible to do manually. 

    Deep learning algorithms can make use of parallel computation. For this, data is usually split into small batches that are processed in parallel. So these algorithms can process large amounts of data in a short time to learn the features and their combinations. This leads to scalability and performance. In short, deep learning complements machine learning algorithms for complex data for which features cannot be described easily. 
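
    As a rough illustration of the batching idea Hemant mentions, the sketch below splits a dataset into small batches. The function name `make_batches` and the batch size are illustrative assumptions, not part of any particular deep learning library; in practice each batch would be handed to parallel hardware such as a GPU.

```python
def make_batches(samples, batch_size):
    """Split a list of samples into consecutive batches of at most batch_size items.

    Hypothetical helper for illustration only; real frameworks provide their own
    data loaders.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


if __name__ == "__main__":
    data = list(range(10))  # stand-in for ten training examples
    for batch in make_batches(data, 4):
        print(batch)
```

    The last batch may be smaller than the others when the dataset size is not a multiple of the batch size.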

    03:13

    Nikita: What can you tell us about the origins of deep learning?

    Hemant: Some of the deep learning concepts like the artificial neuron, perceptron, and multilayer perceptron existed as early as the 1950s. One of the most important concepts, using backpropagation for training ANNs, came in the 1980s. 

    In the 1990s, convolutional neural networks were also introduced for image analysis tasks. Starting in 2000, GPUs were introduced. And from 2010 onwards, GPUs became cheaper and widely available. This fueled the widespread adoption of deep learning uses like computer vision, natural language processing, speech recognition, text translation, and so on. 

    In 2012, major networks like AlexNet and Deep-Q Network were built. From 2016 onward, generative use cases of deep learning also started to come up. Today, we have widely adopted deep learning for a variety of use cases, including large language models and many other types of generative models. 

    04:29

    Lois: Hemant, what are various applications of deep learning algorithms? 

    Hemant: Deep learning algorithms are targeted at a variety of data and applications. For data, we have images, videos, text, and audio. For images, applications can be image classification, object detection, and so on. For textual data, applications are to translate the text or detect the sentiment of a text. For audio, the applications can be music generation, speech to text, and so on. 

    05:08

    Lois: It's important that we select the right deep learning algorithm based on the data and application, right? So how do we do that? 

    Hemant: For image tasks like image classification, object detection, image segmentation, or facial recognition, CNN is a suitable architecture. For text, we have a choice of the latest transformers or LSTM or even RNN. For generative tasks like text summarization and question answering, transformers are a good choice. For generating images, that is, text-to-image generation, transformers, GANs, or diffusion models are the available choices.

    05:51

    Nikita: Let’s dive a little deeper into Artificial Neural Networks. Can you tell us more about them, Hemant?

    Hemant: Artificial Neural Networks are inspired by the human brain. They are made up of interconnected nodes called neurons. 

    Nikita: And how are inputs processed by a neuron? 

    Hemant: In ANN, we assign weights to the connections between neurons. Weighted inputs are added up. And if the sum crosses a specified threshold, the neuron is fired. And the outputs of one layer of neurons become the inputs to another layer. 
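
    The weighted-sum-and-threshold behavior Hemant describes can be sketched in a few lines of Python. The function name `neuron_fires` and the example weights and threshold are illustrative assumptions; modern networks usually replace the hard threshold with a smooth activation function.

```python
def neuron_fires(inputs, weights, threshold):
    """Return True if the weighted sum of the inputs crosses the threshold.

    Hypothetical helper: a hard-threshold neuron, the simplest form of the idea.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return weighted_sum > threshold


if __name__ == "__main__":
    weights = [0.5, 0.9, 0.4]  # assumed connection strengths
    print(neuron_fires([1.0, 0.0, 1.0], weights, threshold=0.8))
    print(neuron_fires([1.0, 0.0, 0.0], weights, threshold=0.8))
```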

    06:27

    Lois: Hemant, tell us about the building blocks of ANN so we understand this better.

    Hemant: So the first building block is layers. We have an input layer, an output layer, and multiple hidden layers. The input layer and output layer are mandatory, and the hidden layers are optional. The second building block is neurons. Neurons are computational units that accept an input and produce an output. 

    Weights determine the strength of the connection between neurons. The connection could be between an input and a neuron, or between one neuron and another. Activation functions work on the weighted sum of inputs to a neuron and produce an output. An additional input to the neuron that allows a certain degree of flexibility is called a bias. 

    07:27

    Nikita: I think we’ve got the components of ANN straight but maybe you should give us an example. You mentioned this example earlier…of needing to train ANN to recognize handwritten digits from images. How would we go about that?

    Hemant: For that, we have to collect a large number of digit images, and we need to train ANN using these images. 

    So, in this case, the images consist of 28 by 28 pixels, which act as the input layer. For the output, we have 10 neurons, which represent digits 0 to 9. And we have multiple hidden layers. So, in this case, we have two hidden layers, which consist of 16 neurons each. 

    The hidden layers are responsible for capturing the internal representation of the raw image data. And the output layer is responsible for producing the desired outcomes. So, in this case, the desired outcome is the prediction of whether the digit is 0 or 1 or up to digit 9. 
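
    To make the layer sizes concrete, here is a minimal sketch of a forward pass through a network with 784 inputs (28 by 28 pixels), two hidden layers of 16 neurons, and 10 outputs. The sigmoid activation, the random initial weights, and all function names are assumptions made for illustration; the network is untrained, so its prediction is arbitrary.

```python
import math
import random


def sigmoid(z):
    """Squash a weighted sum into the range (0, 1). Assumed activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def build_network(layer_sizes, seed=0):
    """Create (weights, biases) for each pair of adjacent layers with small random weights."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        network.append((weights, biases))
    return network


def predict(network, pixels):
    """Feed the inputs through every layer; the output of one layer is the input to the next."""
    activations = pixels
    for weights, biases in network:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


def count_parameters(network):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in network)


if __name__ == "__main__":
    net = build_network([784, 16, 16, 10])
    scores = predict(net, [0.5] * 784)  # a flat gray image as stand-in input
    digit = max(range(10), key=lambda d: scores[d])
    print("parameters:", count_parameters(net))
    print("predicted digit (untrained, so arbitrary):", digit)
```

    Even this small network has a little over thirteen thousand adjustable weights and biases, which is why the weights are learned from data rather than set by hand.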

    So how do we train this particular ANN? For this, we use the backpropagation algorithm. During training, we show an image to the ANN. Let us say it is an image of digit 2. So we expect the output neuron for digit 2 to fire. But in reality, let us say the output neuron for digit 6 fired. 

    09:12

    Lois: So, then, what do we do? 

    Hemant: We know that there is an error. So to correct the error, we adjust the weights of the connections between neurons based on a calculation, which we call the backpropagation algorithm. By showing thousands of images and adjusting the weights iteratively, ANN is able to predict the correct outcome for most of the input images. This process of adjusting weights through backpropagation is called model training. 
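
    The idea of adjusting weights iteratively from the error can be sketched with a deliberately simplified case: a single sigmoid neuron trained by gradient descent. This is only the one-layer special case of backpropagation, not the full multi-layer algorithm, and the learning rate, epoch count, cross-entropy error signal, and the logical-AND toy data are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Forward pass: weighted sum plus bias, passed through the activation."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(examples, learning_rate=0.5, epochs=2000):
    """Repeatedly show examples and nudge the weights to reduce the error.

    Simplification: one neuron only, so the backward pass is a single
    application of the chain rule (assumes a cross-entropy loss).
    """
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = predict(weights, bias, inputs) - target  # how wrong the output was
            # Move each weight against the error, scaled by the input it carried.
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


if __name__ == "__main__":
    # Toy data (assumed): the neuron should fire only when both inputs are 1.
    and_examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
    weights, bias = train_neuron(and_examples)
    for inputs, target in and_examples:
        print(inputs, target, round(predict(weights, bias, inputs), 3))
```

    In a multi-layer network, the same error signal is propagated backward layer by layer so that hidden-layer weights can be adjusted as well.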

    09:48

    Do you have an idea for a new course or learning opportunity? We’d love to hear it! Visit the Oracle University Learning Community and share your thoughts with us on the Idea Incubator. Your suggestion could find a place in future development projects! Visit mylearn.oracle.com to get started. 

    10:09

    Nikita: Welcome back! Let’s move on to CNN. Hemant, what is a Convolutional Neural Network? 

    Hemant: CNN is a type of deep learning model specifically designed for processing and analyzing grid-like data, such as images and videos. In the ANN, the input image is converted to a single-dimensional array and given as an input to the network. But that does not work well with image data because image data is inherently two-dimensional. CNN works better with two-dimensional data. The role of the CNN is to reduce the image into a form that is easier to process, without losing the features that are critical for getting a good prediction. 

    10:53

    Lois: A CNN has different layers, right? Could you tell us a bit about them? 

    Hemant: The first one is the input layer. The input layer is followed by the feature extraction layers, which are a combination and repetition of convolutional layers with ReLU activation and pooling layers. 

    And this is followed by a classification layer. These are the fully connected output layers, where the classification occurs as output classes. The feature extraction layers play a vital role in image classification.  

    11:33

    Nikita: Can you explain these layers with an example?

    Hemant: Let us say we have a robot to inspect a house and tell us what type of a house it is. It uses many tools for this purpose. The first tool is a blueprint detector. It scans different parts of the house, like walls, floors, or windows, and looks for specific patterns or features. 

    The second tool is a pattern highlighter. This tool marks areas detected by the blueprint detector. The next tool is a summarizer. It tries to capture the most significant features of every room. The next tool is a house expert, which looks at all the highlighted patterns and features, and tries to understand the house. 

    The next tool is a guess maker. It assigns probabilities to the different possible house types. And finally, the quality checker randomly checks different parts of the analysis to make sure that the robot doesn't rely too much on any single piece of information. 

    12:40

    Nikita: Ok, so how are you mapping these to the feature extraction layers? 

    Hemant: Similar to the blueprint detector, we have a convolutional layer. This layer applies convolution operations to the input image using small filters known as kernels. 

    Each filter slides across the input image to detect specific features, such as edges, corners, or textures. Similar to the pattern highlighter, we have an activation function. The activation function allows the network to learn more complex and non-linear relationships in the data. The pooling layer is similar to the room summarizer. 

    Pooling helps reduce the spatial dimensions of the feature maps generated by the convolutional layers. Similar to the house expert, we have a fully connected layer, which is responsible for making final predictions or classifications based on the learned features. The softmax layer converts the output of the last fully connected layer into probability scores. 

    The class with the highest probability is the predicted class. This is similar to the guess maker. And finally, we have the dropout layer. This layer is a regularization technique used to prevent overfitting in the network. This has the same role as that of a quality checker. 
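
    The feature extraction steps Hemant maps to the robot's tools can be sketched in plain Python. This is a minimal illustration under stated assumptions: a single hand-written 2 by 2 kernel rather than learned filters, no padding, stride 1, and a tiny 5 by 5 image; the function names are hypothetical, and the fully connected and dropout layers are left out to keep the sketch short.

```python
import math


def convolve2d(image, kernel):
    """Slide the kernel across the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Activation: keep positive responses, zero out negative ones."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Pooling: keep the strongest response in each size-by-size block."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the peak for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Assumed toy image: dark on the left, bright on the right (a vertical edge).
    image = [[0, 0, 1, 1, 1] for _ in range(5)]
    vertical_edge_kernel = [[-1, 1], [-1, 1]]  # hand-picked, not learned
    features = max_pool(relu(convolve2d(image, vertical_edge_kernel)))
    print(features)
    print(softmax([2.0, 1.0, 0.1]))  # the class with the highest probability wins
```

    The pooled feature map is smaller than the original image but still records where the edge was found, which is the sense in which a CNN reduces an image without losing the features that matter.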

    14:05

    Lois: Do CNNs have any limitations that we need to be aware of?

    Hemant: Training CNNs on large data sets can be computationally expensive and time consuming. CNNs are susceptible to overfitting, especially when the training data is limited or imbalanced. CNNs are considered black box models, making them difficult to interpret. 

    And CNNs can be sensitive to small changes in the input leading to unstable predictions. 

    14:33

    Nikita: And what are the top applications of CNN?

    Hemant: One of the most widely used applications of CNNs is image classification. For example, classifying whether an image contains a specific object, say, a cat or a dog. 

    CNNs are used for object detection tasks. The goal here is to draw bounding boxes around objects in an image. CNNs can perform pixel level segmentation, where each pixel in the image is labeled to represent different objects or regions. CNNs are employed for face recognition tasks as well, identifying and verifying individuals based on facial features. 

    CNNs are widely used in medical image analysis, helping with tasks like tumor detection, diagnosis, and classification of various medical conditions. CNNs play an important role in the development of self-driving cars, helping them to recognize and understand road traffic signs, pedestrians, and other vehicles. And CNNs are applied in analyzing satellite images and remote sensing data for tasks such as land cover classification and environmental monitoring. 

    15:50

    Nikita: Hemant, let’s talk about sequence models. What are they and what are they used for?

    Hemant: Sequence models are used to solve problems where the input data is in the form of sequences. The sequences are ordered lists of data points or events. 

    The goal in sequence models is to find patterns and dependencies within the data and make predictions, classifications, or even generate new sequences. 

    16:17

    Lois: Can you give us some examples of sequence models? 

    Hemant: Here are some common examples of sequence models. In natural language processing, deep learning models are used for tasks such as machine translation, sentiment analysis, or text generation. In speech recognition, deep learning models are used to convert recorded audio into text. 

    In music generation, deep learning models can generate new music or create original compositions. Even sequences of hand gestures are interpreted by deep learning models for applications like sign language recognition. In fields like finance or weather prediction, time series data is used to predict future values. 

    17:03

    Nikita: Which deep learning models can be used to work with sequence data? 

    Hemant: Recurrent Neural Networks, abbreviated as RNNs, are a class of neural network architectures specifically designed to handle sequential data. Unlike traditional feedforward neural networks, RNNs have a feedback loop that allows information to persist across different timesteps. 

    The key feature of RNNs is their ability to maintain an internal state, often referred to as a hidden state or memory, which is updated as the network processes each element in the input sequence. The hidden state is then used as input to the network for the next time step, allowing the model to capture dependencies and patterns in the data that are spread across time. 
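
    The hidden-state update Hemant describes can be sketched as follows. The tanh activation and the update rule h_t = tanh(W_xh x_t + W_hh h_(t-1) + b) are the textbook form of a simple RNN and are an assumption here, as are the function names and the small hand-picked weights.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """Compute the new hidden state from the current input and the previous hidden state."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_xh[k], x))          # contribution of the current input
            + sum(w * hi for w, hi in zip(w_hh[k], h_prev))   # contribution of the memory
            + b_h[k]
        )
        for k in range(len(b_h))
    ]


def run_rnn(sequence, w_xh, w_hh, b_h):
    """Process a sequence one element at a time, carrying the hidden state forward."""
    h = [0.0] * len(b_h)  # the memory starts empty
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_xh, w_hh, b_h)
        states.append(h)
    return states


if __name__ == "__main__":
    # Assumed toy weights: 1 input feature, 2 hidden units.
    w_xh = [[0.5], [-0.3]]
    w_hh = [[0.1, 0.2], [0.0, 0.4]]
    b_h = [0.0, 0.0]
    print(run_rnn([[1.0], [0.0]], w_xh, w_hh, b_h)[-1])
    print(run_rnn([[0.0], [1.0]], w_xh, w_hh, b_h)[-1])
```

    The two sequences contain the same values in a different order and end in different hidden states, which is what lets the model capture patterns that depend on order.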

    17:58

    Nikita: Are there various types of RNNs?

    Hemant: There are different types of RNN architectures based on the application. 

    One of them is one-to-one. This is like a feedforward neural network and is not suited for sequential data. A one-to-many model produces multiple output values for one input value. Music generation and sequence generation are some applications that use this architecture. 

    A many-to-one model produces one output value after receiving multiple input values. An example is sentiment analysis based on a review. A many-to-many model produces multiple output values for multiple input values. Examples are machine translation and named entity recognition. 

    RNNs do not perform that well when it comes to capturing long-term dependencies. This is due to the vanishing gradients problem, which is overcome by using the LSTM model. 

    19:07

    Lois: Another acronym. What is LSTM, Hemant?

    Hemant: Long Short-Term Memory, abbreviated as LSTM, works by using a specialized memory cell and gating mechanisms to capture long-term dependencies in sequential data. 

    The key idea behind LSTM is to selectively remember or forget information over time, enabling the model to maintain relevant information over long sequences, which helps overcome the vanishing gradients problem. 

    19:40

    Nikita: Can you take us, step-by-step, through the working of LSTM? 

    Hemant: At each timestep, the LSTM takes an input vector representing the current data point in the sequence. The LSTM also receives the previous hidden state and cell state. These represent what the LSTM has remembered and forgotten up to the current point in the sequence. 

    The core of the LSTM lies in its gating mechanisms, which include three gates: the input gate, the forget gate, and the output gate. These gates are like the filters that control the flow of information within the LSTM cell. The input gate decides what new information from the current input should be added to the memory cell. 

    The forget gate determines what information in the current memory cell should be discarded or forgotten. The output gate regulates how much of the current memory cell should be exposed as the output of the current time step. Using the information from the input gate and forget gate, the LSTM updates its cell state. The LSTM then uses the output gate to produce the current hidden state, which is the output of the LSTM for the current time step and is passed on to the next time step. 
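
    The steps above can be sketched as a single LSTM time step. This follows the common textbook formulation (sigmoid gates, a tanh candidate value, cell state c = f * c_prev + i * g, hidden state h = o * tanh(c)), which is an assumption since the episode does not give equations; the parameter layout and function names are likewise illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Weighted sums plus bias, one value per hidden unit."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]


def lstm_step(x, h_prev, c_prev, params):
    """One time step. params maps a gate name to a (weights, bias) pair (assumed layout)."""
    combined = x + h_prev  # current input joined with the previous hidden state
    i = [sigmoid(z) for z in affine(*params["input"], combined)]        # what to add
    f = [sigmoid(z) for z in affine(*params["forget"], combined)]       # what to discard
    o = [sigmoid(z) for z in affine(*params["output"], combined)]       # what to expose
    g = [math.tanh(z) for z in affine(*params["candidate"], combined)]  # proposed new content
    # Update the cell state: keep part of the old memory, add part of the new content.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # The hidden state is the gated view of the cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


if __name__ == "__main__":
    zero = [[0.0, 0.0]]  # 1 hidden unit, 1 input feature + 1 hidden value
    # Assumed biases: forget gate wide open, input gate nearly closed.
    params = {
        "input": (zero, [-20.0]),
        "forget": (zero, [20.0]),
        "output": (zero, [0.0]),
        "candidate": (zero, [0.0]),
    }
    h, c = lstm_step([1.0], [0.0], [0.7], params)
    print("cell state stays close to 0.7:", c)
    print("hidden state:", h)
```

    With the forget gate open and the input gate closed, the cell state passes through almost unchanged, which is how an LSTM can carry information across many time steps.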

    21:12

    Lois: Thank you, Hemant, for joining us in this episode of the Oracle University Podcast. I learned so much today. If you want to learn more about deep learning, visit mylearn.oracle.com and search for the Oracle Cloud Infrastructure AI Foundations course. And remember, the AI Foundations course and certification are free. So why not get started now?

    Nikita: Right, Lois. In our next episode, we will discuss generative AI and large language models. Until then, this is Nikita Abraham…

    Lois: And Lois Houston signing off!

    21:45

    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

    Everything You Need to Know About the MySQL HeatWave Implementation Associate Certification

    What is MySQL HeatWave? How do I get certified in it? Where do I start?

    Listen to Lois Houston and Nikita Abraham, along with MySQL Developer Scott Stroz, answer all these questions and more on this week's episode of the Oracle University Podcast.

    MySQL Document Store: https://oracleuniversitypodcast.libsyn.com/mysql-document-store

    Oracle MyLearn: https://mylearn.oracle.com/

    Oracle University Learning Community: https://education.oracle.com/ou-community

    LinkedIn: https://www.linkedin.com/showcase/oracle-university/

    X (formerly Twitter): https://twitter.com/Oracle_Edu

    Special thanks to Arijit Ghosh, David Wright, and the OU Studio Team for helping us create this episode.

    --------------------------------------------------------

    Episode Transcript:

    00:00

    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

    00:26

    Nikita: Welcome to the Oracle University Podcast! I’m Nikita Abraham, Principal Technical Editor with Oracle University, and with me is Lois Houston, Director of Innovation Programs.

    Lois: Hi there! For the last two weeks, we’ve been having really exciting discussions on everything AI. We covered the basics of artificial intelligence and machine learning, and we’re taking a short break from that today to talk about the new MySQL HeatWave Implementation Associate Certification with MySQL Developer Advocate Scott Stroz.

    00:59

    Nikita: You may remember Scott from an episode last year where he came on to discuss MySQL Document Store. We’ll post the link to that episode in the show notes so you can listen to it if you haven’t already.

    Lois: Hi Scott! Thanks for joining us again. Before diving into the certification, tell us, what is MySQL HeatWave? 

    01:19

    Scott: Hi Lois, Hi Niki. I’m so glad to be back. So, MySQL HeatWave Database Service is a fully managed database that is capable of running transactional and analytic queries in a single database instance. This can be done across data warehouses and data lakes. We get all the benefits of analytic queries without the latency and potential security issues of performing standard extract, transform, and load, or ETL, operations. Some other MySQL HeatWave database service features are automated system updates and database backups, high availability, in-database machine learning with AutoML, MySQL Autopilot for managing instance provisioning, and enhanced data security. 

    HeatWave is the only cloud database service running MySQL that is built, managed, and supported by the MySQL Engineering team.

    02:14

    Lois: And where can I find MySQL HeatWave?

    Scott: MySQL HeatWave is only available in the cloud. MySQL HeatWave instances can be provisioned in Oracle Cloud Infrastructure or OCI, Amazon Web Services (AWS), and Microsoft Azure. Now, some features though are only available in Oracle Cloud, such as access to MySQL Document Store.

    02:36

    Nikita: Scott, you said MySQL HeatWave runs transactional and analytic queries in a single instance. Can you elaborate on that?

    Scott: Sure, Niki. So, MySQL HeatWave allows developers, database administrators, and data analysts to run transactional queries (OLTP) and analytic queries (OLAP). 

    OLTP, or online transaction processing, allows for real-time execution of database transactions. A transaction is any kind of insertion, deletion, update, or query of data. Most DBAs and developers work with this kind of processing in their day-to-day activities.
     
    OLAP, or online analytical processing, is one way to handle multi-dimensional analytical queries typically used for reporting or data analytics. OLTP system data must typically be exported, aggregated, and imported into an OLAP system. This procedure is called ETL as I mentioned – extract, transform, and load. With large datasets, ETL processes can take a long time to complete, so analytic data could be “old” by the time it is available in an OLAP system. There is also an increased security risk in moving the data to an external source.
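
    To illustrate the difference between the two kinds of queries Scott describes, the sketch below uses Python's built-in `sqlite3` module as a stand-in database. This is an assumption made purely for illustration: it is not MySQL HeatWave and shows none of its features, and the `orders` table and its data are hypothetical. It only contrasts small transactional statements with an aggregate reporting query.

```python
import sqlite3


def demo():
    """Run a few OLTP-style statements, then one OLAP-style aggregate query."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database (stand-in only)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

    # OLTP-style work: short transactions that insert or update individual rows.
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO orders (region, amount) VALUES (?, ?)",
            [("east", 120.0), ("west", 80.0), ("east", 30.0), ("west", 45.5)],
        )
    with conn:
        conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (90.0, 2))

    # OLAP-style work: an aggregate query over many rows, typical of reporting.
    report = conn.execute(
        "SELECT region, COUNT(*), SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return report


if __name__ == "__main__":
    for region, order_count, total in demo():
        print(region, order_count, total)
```

    In a traditional setup, the reporting query would run against a separate analytics system fed by ETL; the point Scott makes is that HeatWave lets both kinds of queries run against the same database instance.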

    03:56

    Scott: MySQL HeatWave eliminates the need for time-consuming ETL processes. We can actually get real-time analytics from our data since HeatWave allows for OLTP and OLAP in a single instance. I should note, this also includes analytics from JSON data that may be stored in the database.

    Another advantage is that applications can use MySQL HeatWave without changing any of the application code. Developers only need to point their applications at the MySQL HeatWave databases. MySQL HeatWave is fully compatible with on-premise MySQL instances, which can allow for a seamless transition to the cloud.

    And one other thing. When MySQL HeatWave has OLAP features enabled, MySQL can determine what type of query is being executed and route it to either the normal database system or the in-memory database.

    04:52

    Lois: That’s so cool! And what about the other features you mentioned, Scott? Automated updates and backups, high availability…

    Scott: Right, Lois. But before that, I want to tell you about the in-memory query accelerator. MySQL HeatWave offers a massively parallel, in-memory hybrid columnar query processing engine. It provides high performance by utilizing algorithms for distributed query processing. And this query processing in MySQL HeatWave is optimized for cloud environments. 

    MySQL HeatWave can be configured to automatically apply system updates, so you will always have the latest and greatest version of MySQL.

    Then, we have automated backups. By this, I mean MySQL HeatWave can be configured to provide automated backups with point-in-time recovery to ensure data can be restored to a particular date and time. MySQL HeatWave also allows us to define a retention plan for our database backups, that means how long we keep the backups before they are deleted.

    High availability with MySQL HeatWave allows for more consistent uptime. When using high availability, MySQL HeatWave instances can be provisioned across multiple availability domains, providing automatic failover for when the primary node becomes unavailable. All availability domains within a region are physically separated from each other to mitigate the possibility of a single point of failure.

    06:14

    Scott: We also have MySQL Lakehouse. Lakehouse allows for the querying of data stored in object storage in various formats. This can be CSV, Parquet, Avro, or an export format from other database systems. And basically, we point Lakehouse at data stored in Oracle Cloud, and once it’s ingested, the data can be queried just like any other data in a database. Lakehouse supports querying data up to half a petabyte in size using the HeatWave engine. And this allows users to take advantage of HeatWave for non-MySQL workloads.

    MySQL Autopilot is a part of MySQL HeatWave and can be used to predict the number of HeatWave nodes a system will need and automatically provision them as part of a cluster. Autopilot has features that can handle automatic thread pooling and database shape predicting. A “shape” is one of the many different CPU, memory, and ethernet traffic configurations available for MySQL HeatWave.

    MySQL HeatWave includes some advanced security features such as asymmetric encryption and automated data masking at query execution.

    As you can see, there are a lot of features covered under the HeatWave umbrella!

    07:31

    Did you know that Oracle University offers free courses on Oracle Cloud Infrastructure? You’ll find training on everything from cloud computing, database, and security to artificial intelligence and machine learning, all free to subscribers. So, what are you waiting for? Pick a topic, leverage the Oracle University Learning Community to ask questions, and then sit for your certification. Visit mylearn.oracle.com to get started. 

    08:02

    Nikita: Welcome back! Now coming to the certification, who can actually take this exam, Scott?

    Scott: The MySQL HeatWave Implementation Associate Certification Exam is designed specifically for administrators and data scientists who want to provision, configure, and manage MySQL HeatWave for transactions, analytics, machine learning, and Lakehouse.

    08:22

    Nikita: Can someone who’s just graduated, say an engineering graduate interested in data analytics, take this certification? Are there any prerequisites? What are the career prospects for them?

    Scott: There are no mandatory prerequisites, but anyone who wants to take the exam should have experience with MySQL HeatWave and other aspects of OCI, such as virtual cloud networks and identity and security processes. Also, the learning path on MyLearn will be extremely helpful when preparing for the exam, but you are not required to complete the learning path before registering for the exam.

    The exam focuses more on getting MySQL HeatWave running (and keeping it running) than accessing the data. That doesn’t mean it is not helpful for someone interested in data analytics. I think it can be helpful for data analysts to understand how the system providing the data functions, even if it is at just a high level. It is also possible that data analysts might be responsible for setting up their own systems and importing and managing their own data.

    09:23

    Lois: And how do I get started if I want to get certified on MySQL HeatWave?

    Scott: So, you’ll first need to go to mylearn.oracle.com and look for the “Become a MySQL HeatWave Implementation Associate” learning path. The learning path consists of over 10 hours of training across 8 different courses. 

    These courses include “Getting Started with MySQL HeatWave Database Service,” which offers an introduction to some Oracle Cloud functionality such as security and networking, as well as showing one way to connect to a MySQL HeatWave instance. Another course demonstrates how to configure MySQL instances and copy that configuration to other instances. Other courses cover how to migrate data into MySQL HeatWave, set up and manage high availability, and configure HeatWave for OLAP.

    You’ll find labs where you can perform hands-on activities, student and activity guides, and skill checks to test yourself along the way. And there’s also the option to Ask the Instructor if you have any questions you need answers to. You can also access the Oracle University Learning Community and discuss topics with others on the same journey. The learning path includes a practice exam to check your readiness to pass the certification exam.

    10:33

    Lois: Yeah, and remember, access to the entire learning path is free so there’s nothing stopping you from getting started right away. Now Scott, what does the certification test you on?

    Scott: The MySQL HeatWave Implementation exam, which is an associate-level exam, covers various topics. It will validate your ability to identify key features and benefits of MySQL HeatWave and describe the MySQL HeatWave architecture; identify Virtual Cloud Network (VCN) requirements and the different methods of connecting to a MySQL HeatWave instance; manage the automatic backup process and restore database systems from these backups; configure and manage read replicas and inbound replication channels; import data into MySQL HeatWave; configure and manage high availability and clustering of MySQL HeatWave instances.

    I know this seems like a lot of different topics. That is why we recommend anyone interested in the exam follow the learning path. It will help make sure you have the exposure to all the topics that are covered by the exam.

    11:35

    Lois: Tell us more about the certification process itself.

    Scott: While the courses we already talked about are valuable when preparing for the exam, nothing is better than hands-on experience. We recommend that candidates have hands-on experience with MySQL HeatWave with real-world implementations. The format of the exam is Multiple Choice. It is 90 minutes long and consists of 65 questions. When you’ve taken the recommended training and feel ready to take the certification exam, you need to purchase the exam and register for it. You go through the section on things to do before the exam and the exam policies, and then all that’s left to do is schedule the date and time of the exam according to when is convenient for you.

    12:16

    Nikita: And once you’ve finished the exam?

    Scott: When you finish the exam, your score will be displayed on the screen. You will also receive an email indicating whether you passed or failed. You can view your exam results and full score report in Oracle CertView, Oracle’s certification portal. From CertView, you can download and print your eCertificate and even share your newly earned badge on places like Facebook, Twitter, and LinkedIn.

    12:38

    Lois: And for how long does the certification remain valid, Scott?

    Scott: There is no expiration date for the exam, so the certification will remain valid for as long as the material that is covered remains relevant. 

    12:49

    Nikita: What’s the next step for me after I get this certification? What other training can I take?

    Scott: So, because this exam is an associate level exam, it is kind of a stepping stone along a person’s MySQL training. I do not know if there are plans for a professional level exam for HeatWave, but Oracle University has several other training programs that are MySQL-specific. There are learning paths to help prepare for the MySQL Database Administrator and MySQL Database Developer exams. As with the HeatWave learning paths, the learning paths for these exams include video tutorials, hands-on activities, skill checks, and practice exams.

    13:27

    Lois: I think you’ve told us everything we need to know about this certification, Scott. Are there any parting words you might have?

    Scott: We know that the whole process of training and getting certified may seem daunting, but we’ve really tried to simplify things for you with the “Become a MySQL HeatWave Implementation Associate” learning path. It not only prepares you for the exam but also gives you experience with features of MySQL HeatWave that will surely be valuable in your career.

    13:51

    Lois: Thanks so much, Scott, for joining us today.

    Nikita: Yeah, we’ve had a great time with you.

    Scott: Thanks for having me.

    Lois: Next week, we’ll get back to our focus on AI with a discussion on deep learning. Until then, this is Lois Houston…

    Nikita: And Nikita Abraham, signing off.

    14:07

    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

    Machine Learning

    Does machine learning feel like too convoluted a topic? Not anymore!

    Listen to hosts Lois Houston and Nikita Abraham, along with Senior Principal OCI Instructor Hemant Gahankari, talk about foundational machine learning concepts and dive into how supervised learning, unsupervised learning, and reinforcement learning work.

    Oracle MyLearn: https://mylearn.oracle.com/

    Oracle University Learning Community: https://education.oracle.com/ou-community

    LinkedIn: https://www.linkedin.com/showcase/oracle-university/

    X (formerly Twitter): https://twitter.com/Oracle_Edu

    Special thanks to Arijit Ghosh, David Wright, Himanshu Raj, and the OU Studio Team for helping us create this episode.

    ---------------------------------------------------------

    Episode Transcript:

     

    00:00

    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

    00:26

    Lois: Hello and welcome to the Oracle University Podcast. I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Principal Technical Editor.

    Nikita: Hi everyone! Last week, we went through the basics of artificial intelligence and we’re going to take it a step further today by talking about some foundational machine learning concepts. After that, we’ll discuss the three main types of machine learning models: supervised learning, unsupervised learning, and reinforcement learning.

    00:57

    Lois: Hemant Gahankari, a Senior Principal OCI Instructor, joins us for this episode. Hi Hemant! Let’s dive right in. What is machine learning? How does it work?

    Hemant: Machine learning is a subset of artificial intelligence that focuses on creating computer systems that can learn and predict outcomes from given examples without being explicitly programmed. It is powered by algorithms that incorporate intelligence into machines by automatically learning from a set of examples usually provided as data.

    01:34

    Nikita: Give us a few examples of machine learning… so we can see what it can do for us.

    Hemant: Machine learning is used by all of us in our day-to-day life.

    When we shop online, we get product recommendations based on our preferences and our shopping history. This is powered by machine learning.

    We are notified about movie recommendations based on our viewing history and the choices of other similar viewers. This too is driven by machine learning.

    While browsing emails, we are warned of a spam mail because machine learning classifies whether the mail is spam or not based on its content. In the increasingly popular self-driving cars, machine learning is responsible for taking the car to its destination.

    02:24

    Lois: So, how does machine learning actually work?

    Hemant: Let us say we have a computer and we need to teach the computer to differentiate between a cat and a dog. We do this by describing features of a cat or a dog.

    Dogs and cats have distinguishing features. For example, body color, texture, and eye color are some of the defining features that can be used to differentiate a cat from a dog. These are collectively called input data.

    We also provide a corresponding output, which is called a label, which can be a dog or a cat in this case. By describing a specific set of features, we can say that it is a cat or a dog.

    The machine learning model is first trained with the data set. The training data set consists of a set of features and output labels, and is given as an input to the machine learning model.

    During the process of training, the machine learning model learns the relation between input features and corresponding output labels from the provided data. Once the model learns from the data, we have a trained model.

    Once the model is trained, it can be used for inference. Inference is a process of getting a prediction by giving a data point. In this example, we input features of a cat or a dog, and the trained model predicts the output that is a cat or a dog label.
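
    Editor's note: The training and inference steps Hemant describes can be sketched in a few lines of Python. This is a minimal illustration and not something from the episode. The two features (weight and ear length), the numbers, and the "closest average" rule are all assumptions made up for the example.

```python
# Illustrative sketch only: features, numbers, and the learning rule are assumptions.
from statistics import mean

# Training data set: each example pairs input features with an output label.
# Hypothetical features: [weight in kg, ear length in cm].
training_data = [
    ([4.0, 6.5], "cat"),
    ([3.5, 6.0], "cat"),
    ([5.0, 7.0], "cat"),
    ([20.0, 11.0], "dog"),
    ([30.0, 12.5], "dog"),
    ([25.0, 10.0], "dog"),
]


def train(examples):
    """Training: learn one average feature vector (centroid) per label."""
    rows_by_label = {}
    for features, label in examples:
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in rows_by_label.items()
    }


def predict(model, features):
    """Inference: return the label whose centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


model = train(training_data)         # the result of training is a trained model
print(predict(model, [4.2, 6.2]))    # a light animal with short ears
print(predict(model, [28.0, 12.0]))  # a heavy animal with longer ears
```

    Real models learn far richer relationships than an average per label, but the shape is the same: labeled examples go in during training, and a prediction comes out during inference.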

    The types of machine learning models depend on whether we have a labeled output or not.

    04:08

    Nikita: Oh, there are different types of machine learning models?

    Hemant: In general, there are three types of machine learning approaches.

    In supervised machine learning, labeled data is used to train the model. The model learns the relation between features and labels.

    Unsupervised learning is generally used to understand relationships within a data set. Labels are not used or are not available.

    Reinforcement learning uses algorithms that learn from outcomes to make decisions or choices.

    04:45

    Lois: Ok…supervised learning, unsupervised learning, and reinforcement learning. Where do we use each of these machine learning models?

    Hemant: Some of the popular applications of supervised machine learning are disease detection, weather forecasting, stock price prediction, spam detection, and credit scoring. For example, in disease detection, the patient data is input to a machine learning model, and machine learning model predicts if a patient is suffering from a disease or not.

    For unsupervised machine learning, some of the most common real-time applications are to detect fraudulent transactions, customer segmentation, outlier detection, and targeted marketing campaigns. So for example, given the transaction data, we can look for patterns that lead to fraudulent transactions.

    Most popular among reinforcement learning applications are automated robots, autonomous driving cars, and playing games.

    05:51

    Nikita: I want to get into how each type of machine learning works. Can we start with supervised learning?

    Hemant: Supervised learning is a machine learning model that learns from labeled data. The model learns the mapping between the input and the output.

    In a house price predictor model, we input the house size in square feet and the model predicts the price of the house. Or suppose we need to develop a machine learning model for detecting cancer. The input to the model would be the person's medical details, and the output would be whether the tumor is malignant or not.

    06:29

    Lois: So, that mapping between the input and output is fundamental in supervised learning.

    Hemant: Supervised learning is similar to a teacher teaching a student. The model is trained with the past outcomes and it learns the relationship or mapping between the input and output.

    In a supervised machine learning model, the outputs can be either categorical or continuous. When the output is continuous, we use regression. And when the output is categorical, we use classification.
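
    Editor's note: As a rough sketch of the difference, the snippet below fits a straight line to predict a continuous house price (regression) and uses a nearest-example rule to predict a category (classification). All numbers are invented, the tumor sizes are not medically meaningful, and both techniques are simply convenient stand-ins chosen for the example.

```python
# Illustrative sketch only: all data is invented for the example.

# Regression: the output is continuous (a price).
sizes = [1000.0, 1500.0, 2000.0, 2500.0]            # house size in square feet
prices = [200000.0, 300000.0, 400000.0, 500000.0]   # sale price


def fit_line(xs, ys):
    """Ordinary least squares for a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 1800.0 + intercept   # price estimate for 1,800 sq ft

# Classification: the output is categorical (one of a fixed set of labels).
# Hypothetical single feature: tumor size in cm.
labeled_tumors = [(1.0, "benign"), (1.5, "benign"), (4.0, "malignant"), (5.0, "malignant")]


def classify(size_cm):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(labeled_tumors, key=lambda example: abs(example[0] - size_cm))
    return nearest[1]


print(predicted_price)   # a number on a continuous scale
print(classify(4.5))     # a category
```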

    07:05

    Lois: We want to keep this discussion at a high level, so we’re not going to get into regression and classification. But if you want to learn more about these concepts and look at some demonstrations, visit mylearn.oracle.com.

    Nikita: Yeah, look for the Oracle Cloud Infrastructure AI Foundations course and you’ll find a lot of resources that you can make use of.

    07:30

    The Oracle University Learning Community is an excellent place to collaborate and learn with Oracle experts and fellow learners. Grow your skills, inspire innovation, and celebrate your successes. All your activities, from liking a post to answering questions and sharing with others, will help you earn a valuable reputation, badges, and ranks to be recognized in the community.

    Visit mylearn.oracle.com to get started. 

    07:58

    Nikita: Welcome back! So that was supervised machine learning. What about unsupervised machine learning, Hemant?

    Hemant: Unsupervised machine learning is a type of machine learning where there are no labeled outputs. The algorithm learns the patterns and relationships in the data and groups similar data items. In unsupervised machine learning, the patterns in the data are explored explicitly without being told what to look for.

    For example, if you give a set of different-colored LEGO pieces to a child and ask them to sort the pieces, they may group the LEGO pieces based on any patterns they observe. It could be based on same color or same size or same type. Similarly, in unsupervised learning, we group unlabeled data sets.

    One more example could be-- say, imagine you have a basket of various fruits-- say, apples, bananas, and oranges-- and your task is to group these fruits based on their similarities. You observe that some fruits are round and red, while others are elongated and yellow. Without being told explicitly, you decide to group the round and red fruits together as one cluster and the elongated and yellow fruits as another cluster. There you go. You have just performed an unsupervised learning task.
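
    Editor's note: The fruit-grouping example can be sketched with k-means, one common clustering algorithm. The episode does not name a specific algorithm, so k-means is an assumption here, as are the two numeric features (roundness and redness on a 0 to 1 scale) and the choice of starting centroids.

```python
# Illustrative sketch only: features, values, and the use of k-means are assumptions.

# Unlabeled data: [roundness, redness], each on a 0-to-1 scale.
fruits = [
    [0.90, 0.80],   # round and red
    [0.95, 0.90],
    [0.85, 0.85],
    [0.20, 0.10],   # elongated and yellow
    [0.15, 0.05],
    [0.25, 0.15],
]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the average of its assigned points."""
    centroids = [list(c) for c in initial_centroids]   # copy so the input is not mutated
    assignments = []
    for _ in range(iterations):
        assignments = [
            min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                centroids[i] = [sum(column) / len(members) for column in zip(*members)]
    return assignments, centroids


# Start from the first and last points; no labels are supplied anywhere.
assignments, centroids = k_means(fruits, [fruits[0], fruits[-1]])
print(assignments)   # cluster index for each fruit
```

    The algorithm is never told which fruit is which. It only sees that some points sit close together and groups them.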

    09:21

    Lois: Where is unsupervised machine learning used? Can you take us through some use cases?

    Hemant: The first use case of unsupervised machine learning is market segmentation. In market segmentation, one example is providing the purchasing details of an online shop to a clustering algorithm. Based on the items purchased and purchasing behavior, the clustering algorithm can identify customers based on the similarity between the products purchased. For example, customers with a particular age group who buy protein diet products can be shown an advertisement of sports-related products.

    The second use case is outlier analysis. One typical example of outlier analysis is to provide credit card purchase data for clustering. Fraudulent transactions can be detected by a bank by using outliers. In some transactions, amounts are too high or recurring, which signifies an outlier.
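
    Editor's note: Here is a small sketch of flagging an unusually high transaction amount. It uses a simple median-based rule rather than the clustering approach mentioned in the episode, and the amounts and the threshold of 10 are assumptions chosen only for illustration.

```python
# Illustrative sketch only: a median-based rule swapped in for clustering.
from statistics import median

# Hypothetical credit card transaction amounts.
amounts = [25.0, 40.0, 32.0, 18.0, 55.0, 29.0, 4800.0]


def find_outliers(values, threshold=10.0):
    """Flag values whose distance from the median is more than `threshold`
    times the typical distance (the median absolute deviation)."""
    center = median(values)
    spread = median(abs(v - center) for v in values)
    if spread == 0:
        return []   # all values are essentially identical; nothing stands out
    return [v for v in values if abs(v - center) / spread > threshold]


print(find_outliers(amounts))
```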

    The third use case is recommendation systems. An example for recommendation systems is to provide users' movie viewing history as input to a clustering algorithm. It clusters users based on the type or rating of movies they have watched. The output helps to provide personalized movie recommendations to users. The same applies for music recommendations also.

    10:53

    Lois: And finally, Hemant, let’s talk about reinforcement learning.

    Hemant: Reinforcement learning is like teaching a dog new tricks. You reward it when it does something right, and over time, it learns to perform these actions to get more rewards. Reinforcement learning is a type of Machine Learning that enables an agent to learn from its interaction with the environment, while receiving feedback in the form of rewards or penalties without any labeled data.

    Reinforcement learning is more prevalent in our daily lives than we might realize. The development of self-driving cars and autonomous drones relies heavily on reinforcement learning to make real-time decisions based on sensor data, traffic conditions, and safety considerations.

    Many video games, virtual reality experiences, and interactive entertainment use reinforcement learning to create intelligent and challenging computer-controlled opponents. The AI characters in games learn from player interactions and become more difficult to beat as the game progresses.

    12:05

    Nikita: Hemant, take us through some of the terminology that’s used with reinforcement learning.

    Hemant: Let us say we want to train a self-driving car to drive on a road and reach its destination. For this, it would need to learn how to steer the car based on what it sees in front through a camera. The car, together with its intelligence to steer on the road, is called an agent.

    More formally, agent is a learner or decision maker that interacts with the environment, takes actions, and learns from the feedback received. Environment, in this case, is the road and its surroundings with which the car interacts. More formally, environment is the external system with which the agent interacts. It is the world or context in which the agent operates and receives feedback for its actions.

    What we see through a camera in front of a car at a moment is a state. State is a representation of the current situation or configuration of the environment at a particular time. It contains the necessary information for the agent to make decisions. The actions in this example are to drive left, or right, or keep straight. Actions are a set of possible moves or decisions that the agent can take in a given state.

    Actions have an impact on the environment and influence future states. After driving through the road many times, the car learns what action to take when it views a road through the camera. This learning is a policy. Formally, policy is a strategy or mapping that the agent uses to decide which action to take in a given state. It defines the agent's behavior and determines how it selects actions.
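
    Editor's note: The terms agent, environment, state, action, and policy map onto code fairly directly. The sketch below is a toy assumption: a three-lane road where the state is the car's lane, the reward values are invented, and the policy is written by hand rather than learned.

```python
# Illustrative sketch only: the road, rewards, and policy are made up.

ACTIONS = ["left", "straight", "right"]   # moves the agent can take
CENTER_LANE = 1                           # lanes are numbered 0, 1, 2


def environment_step(state, action):
    """Environment: given the current lane (state) and an action,
    return the next state and a reward as feedback."""
    move = {"left": -1, "straight": 0, "right": 1}[action]
    next_state = max(0, min(2, state + move))             # the car cannot leave the road
    reward = 1 if next_state == CENTER_LANE else -1       # invented reward signal
    return next_state, reward


# Policy: a mapping from each state to the action the agent takes in it.
policy = {0: "right", 1: "straight", 2: "left"}


def run_episode(start_state, steps=5):
    """The agent follows its policy and collects rewards from the environment."""
    state, total_reward = start_state, 0
    for _ in range(steps):
        action = policy[state]
        state, reward = environment_step(state, action)
        total_reward += reward
    return state, total_reward


print(run_episode(start_state=0))
```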

    13:52

    Lois: Ok. Say we’re talking about the training loop of reinforcement learning in the context of training a dog to learn tricks. We want it to pick up a ball, roll, sit…

    Hemant: Here the dog is an agent, and the place it receives training is the environment. While training the dog, you provide a positive reward signal if the dog picks it right and a warning or punishment if the dog does not pick up a trick. In due course, the dog gets trained by the positive rewards or negative punishments.

    The same tactics are applied to train a machine in reinforcement learning. For machines, the policy is the brain of our agent. It is a function that tells the agent what action to take in a given state. The goal of a reinforcement learning algorithm is to find a policy that will yield a lot of rewards for the agent if the agent follows that policy. This is referred to as the optimal policy.

    Through a process of learning from experiences and feedback, the agent becomes more proficient at making good decisions and accomplishing tasks. This process continues until eventually we end up with the optimal policy. The optimal policy is learned through training by using algorithms like Deep Q Learning or Q Learning.
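
    Editor's note: Below is a minimal sketch of tabular Q-learning, one of the algorithms Hemant mentions. The environment (a five-cell corridor with a goal at the far end), the reward of 1 at the goal, and the learning settings are all assumptions picked for the example, and the random seed is fixed only to make runs repeatable.

```python
# Illustrative sketch only: environment, rewards, and settings are assumptions.
import random

random.seed(0)   # fixed seed so the sketch is repeatable

N_STATES = 5                  # cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate

# Q-table: estimated future reward for taking each action in each state.
q_table = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}


def step(state, action):
    """Environment: move along the corridor and reward reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def choose_action(state):
    """Mostly pick the best-known action, sometimes explore at random."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    best = max(q_table[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if q_table[(state, a)] == best])


for _ in range(500):                 # training episodes
    state = 0
    for _ in range(100):             # cap on steps per episode
        action = choose_action(state)
        next_state, reward = step(state, action)
        best_next = max(q_table[(next_state, a)] for a in ACTIONS)
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        q_table[(state, action)] += ALPHA * (
            reward + GAMMA * best_next - q_table[(state, action)]
        )
        state = next_state
        if state == N_STATES - 1:
            break

# The learned policy: the highest-valued action in each non-goal state.
learned_policy = {
    s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)
}
print(learned_policy)
```

    Deep Q-learning follows the same idea but replaces the table with a neural network, which this sketch does not attempt.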

    15:19

    Nikita: So through multiple training iterations, it gets better. That’s fantastic. Thanks, Hemant, for joining us today. We’ve learned so much from you.

    Lois: Remember, the course and certification are free, so if you’re interested, make sure you log in to mylearn.oracle.com and get going. Join us next week for another episode of the Oracle University Podcast. Until then, I’m Lois Houston…

    Nikita: And Nikita Abraham signing off!

    15:48

     

    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

    Introduction to Artificial Intelligence (AI)

    You probably interact with artificial intelligence (AI) more than you realize. So, there’s never been a better time to start figuring out how it all works.
     
    Join Lois Houston and Nikita Abraham as they decode the fundamentals of AI so that anyone, irrespective of their technical background, can leverage the benefits of AI and tap into its infinite potential.
     
    Together with Senior Cloud Engineer Nick Commisso, they take you through key AI concepts, common AI tasks and domains, and the primary differences between AI, machine learning, and deep learning.
     
     
    Oracle University Learning Community: https://education.oracle.com/ou-community
     
     
    X (formerly Twitter): https://twitter.com/Oracle_Edu
     
    Special thanks to Arijit Ghosh, David Wright, Himanshu Raj, and the OU Studio Team for helping us create this episode.
     
    --------------------------------------------------------
     
    Episode Transcript

    00:00

    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

    00:26

    Nikita: Hello and welcome to the Oracle University Podcast. I’m Nikita Abraham, Principal Technical Editor with Oracle University, and with me is Lois Houston, Director of Innovation Programs.

    Lois: Hi there! Welcome to a new season of the Oracle University Podcast. I’m so excited about this season because we’re going to delve into the world of artificial intelligence. In upcoming episodes, we’ll talk about the fundamentals of artificial intelligence and machine learning. And we’ll discuss neural network architectures, generative AI and large language models, the OCI AI stack, and OCI AI services.

    01:06

    Nikita: So, if you’re an IT professional who wants to start learning about AI and ML or even if you’re a student who is familiar with OCI or similar cloud services, but have no prior exposure to this field, you’ll want to tune in to these episodes.

    Lois: That’s right, Niki. So, let’s get started. Today, we’ll talk about the basics of artificial intelligence with Senior Cloud Engineer Nick Commisso. Hi Nick! Thanks for joining us today. So, let’s start right at the beginning. What is artificial intelligence?

    01:36

    Nick: Well, the ability of machines to imitate the cognitive abilities and problem solving capabilities of human intelligence can be classified as artificial intelligence or AI.

    01:47

    Nikita: Now, when you say capabilities and abilities, what are you referring to?

    Nick: Human intelligence is the intellectual capability of humans that allows us to learn new skills through observation and mental digestion, to think through and understand abstract concepts and apply reasoning, to communicate using a language and understand the nonverbal cues, such as facial recognition, tone variation, and body language. 

    You can handle objections in real time, even in a complex setting. You can plan for short and long-term situations or projects. And, of course, you can create music and art or invent something new like an original idea. 

    If you can replicate any of these human capabilities in machines, this is artificial general intelligence or AGI. So in other words, AGI can mimic human sensory and motor skills, performance, learning, and intelligence, and use these abilities to carry out complicated tasks without human intervention. 

    When we apply AGI to solve problems with specific and narrow objectives, we call it artificial intelligence or AI. 

    02:55

    Lois: It seems like AI is everywhere, Nick. Can you give us some examples of where AI is used?

    Nick: AI is all around us, and you've probably interacted with AI, even if you didn't realize it. Some examples of AI can be viewing an image or an object and identifying if that is an apple or an orange. It could be examining an email and classifying it as spam or not. It could be writing computer language code or predicting the price of an older car.

    So let's get into some more specifics of AI tasks and the nature of related data. Machine learning, deep learning, and data science are all associated with AI, and it can be confusing to distinguish between them.

    03:36

    Nikita: Why do we need AI? Why’s it important? 

    Nick: AI is vital in today's world, and with the amount of data that's generated, it far exceeds the human ability to absorb, interpret, and actually make decisions based on that data. That's where AI comes in handy by enhancing the speed and effectiveness of human efforts. 

    So here are two major reasons why we need AI. Number one, we want to eliminate or reduce the amount of routine tasks, and businesses have a lot of routine tasks that need to be done in large numbers. So things like approving a credit card or a bank loan, processing an insurance claim, recommending products to customers are just some examples of routine tasks that can be handled.

    And second, we, as humans, need a smart friend who can create stories and poems, designs, create code and music, and have humor, just like us. 

    04:33

    Lois: I’m onboard with getting help from a smart friend! There are different domains in AI, right, Nick? 

    Nick: We have language for language translation; vision, like image classification; speech, like text to speech; product recommendations that can help you cross-sell products; anomaly detection, like detecting fraudulent transactions; learning by reward, like self-driven cars. You have forecasting with weather forecasting. And, of course, generating content like image from text. 

    05:03

    Lois: There are so many applications. Nick, can you tell us more about these commonly used AI domains like language, audio, speech, and vision?

    Nick: Language-related AI tasks can be text related or generative AI. Text-related AI tasks use text as input, and the output can vary depending on the task. Some examples include detecting language, extracting entities in a text, or extracting key phrases and so on. 

    Consider the example of translating text. There are many text translation tools where you simply type or paste your text into a given text box, choose your source and target language, and then click translate.

    Now, let's look at the generative AI tasks. They are generative, which means the output text is generated by a model. Some examples are creating text like stories or poems, summarizing a text, answering questions, and so on. Let's take the example of ChatGPT, the most well-known generative chat bot. These bots can create responses from their training on large language models, and they continuously grow through machine learning. 

    06:10

    Nikita: What can you tell us about using text as data?

    Nick: Text is inherently sequential, and text consists of sentences. Sentences can have multiple words, and those words need to be converted to numbers for them to be used to train language models. This is called tokenization. Now, the length of sentences can vary, and all the sentence lengths need to be made equal. This is done through padding.
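
    Editor's note: Tokenization and padding can be sketched as follows. This is a simplified, word-level illustration. The sentences, the `<pad>` token, and the choice of ID 0 for padding are assumptions, and production tokenizers typically work on sub-word pieces rather than whole words.

```python
# Illustrative sketch only: word-level tokenization with made-up sentences.

sentences = ["machine learning is fun", "deep learning", "learning is fun"]

# Build a vocabulary that maps each distinct word to a number.
# ID 0 is reserved for the padding token (an assumption for this sketch).
vocabulary = {"<pad>": 0}
for sentence in sentences:
    for word in sentence.split():
        vocabulary.setdefault(word, len(vocabulary))


def tokenize(sentence):
    """Tokenization: convert each word in a sentence to its number."""
    return [vocabulary[word] for word in sentence.split()]


token_ids = [tokenize(sentence) for sentence in sentences]

# Padding: extend shorter sequences with 0s so every sequence has equal length.
max_length = max(len(ids) for ids in token_ids)
padded = [ids + [0] * (max_length - len(ids)) for ids in token_ids]

for row in padded:
    print(row)
```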

    Words can have similarities with other words, and sentences can also be similar to other sentences. The similarity can be measured through dot similarity or cosine similarity. We need a way to indicate that similar words or sentences may be close by. This is done through representation called embedding. 
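
    Editor's note: The two similarity measures can be written in a few lines. The three-dimensional embedding vectors below are invented for illustration. Real embeddings are learned by a model and usually have hundreds of dimensions.

```python
# Illustrative sketch only: the embedding values are invented.
import math

# Hypothetical embeddings: similar words are given nearby vectors.
embeddings = {
    "cat": [0.90, 0.80, 0.10],
    "kitten": [0.85, 0.90, 0.15],
    "car": [0.10, 0.20, 0.95],
}


def dot(a, b):
    """Dot product: larger when vectors point the same way and are long."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity: the dot product after normalizing away vector length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


print(cosine_similarity(embeddings["cat"], embeddings["kitten"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
```

    With these made-up vectors, "cat" scores much closer to "kitten" than to "car", which is the behavior an embedding is meant to capture.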

    06:56

    Nikita: And what about language AI models?

    Nick: Language AI models refer to artificial intelligence models that are specifically designed to understand, process, and generate natural language. These models have been trained on vast amounts of textual data that can perform various natural language processing or NLP tasks. 

    The task that needs to be performed decides the type of input and output. The deep learning model architectures that are typically used to train models that perform language tasks are recurrent neural networks, which process data sequentially and store hidden states; long short-term memory networks, which process data sequentially and retain context better through the use of gates; and transformers, which process data in parallel and use the concept of self-attention to better understand the context.

    07:48

    Lois: And then there’s speech-related AI, right?

    Nick: Speech-related AI tasks can be either audio related or generative AI. Speech-related AI tasks use audio or speech as input, and the output can vary depending on the task. For example, speech-to-text conversion or speaker recognition, voice conversion, and so on. Generative AI tasks are generative in nature, so the output audio is generated by a model. For example, you have music composition and speech synthesis.

    Audio or speech is digitized as snapshots taken in time. The sample rate is the number of times in a second an audio sample is taken. Most digital audio has a sampling rate of 44.1 kilohertz, which is also the sampling rate for audio CDs.

    Multiple samples need to be correlated to make sense of the data. For example, listening to a song for a fraction of a second, you won't be able to infer much about the song, and you'll probably need to listen to it a little bit longer.
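
    Editor's note: To make the idea of a sample rate concrete, the sketch below digitizes a pure tone at 44.1 kilohertz. The 440 Hz tone and the 10 millisecond duration are assumptions chosen for the example.

```python
# Illustrative sketch only: the tone and duration are arbitrary choices.
import math

SAMPLE_RATE = 44_100   # samples per second (44.1 kHz, as used for audio CDs)


def sample_sine(frequency_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Take evenly spaced snapshots of a sine wave over the given duration."""
    sample_count = round(sample_rate * duration_s)
    return [
        math.sin(2 * math.pi * frequency_hz * i / sample_rate)
        for i in range(sample_count)
    ]


samples = sample_sine(frequency_hz=440.0, duration_s=0.01)
print(len(samples))   # number of snapshots in 10 milliseconds of audio
print(samples[:3])    # individual samples say little on their own
```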

    Audio and speech AI models are designed to process and understand audio data, including spoken language. These deep-learning model architectures are used to train models that perform these tasks: recurrent neural networks, long short-term memory, transformers, variational autoencoders, waveform models, and Siamese networks. All of the models take into consideration the sequential nature of audio.

    09:21

    Did you know that Oracle University offers free courses on Oracle Cloud Infrastructure? You’ll find training on everything from cloud computing, database, and security to artificial intelligence and machine learning, all free to subscribers. So, what are you waiting for? Pick a topic, leverage the Oracle University Learning Community to ask questions, and then sit for your certification. Visit mylearn.oracle.com to get started.

    09:49

    Nikita: Welcome back! Now that we’ve covered language and speech-related tasks, let’s move on to vision-related tasks.

    Nick: Vision-related AI tasks could be image related or generative AI. Image-related AI tasks will use an image as an input, and the output depends on the task. Some examples are classifying images, identifying objects in an image, and so on. Facial recognition is one of the most popular image-related tasks that is often used for surveillance and tracking of people in real time, and it's used in a lot of different fields, including security, biometrics, law enforcement, and social media.

    For generative AI tasks, the output image is generated by a model. For example, creating an image from a contextual description, generating images of a specific style or a high resolution, and so on. It can create extremely realistic new images and videos by generating original 3D models of an object, machine components, buildings, medication, people, and even more.

    10:53

    Lois: So, then, here again I need to ask, how do images work as data?

    Nick: Images consist of pixels, and pixels can be either grayscale or color. And we can't really make out what an image is just by looking at one pixel. 
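
    Editor's note: A small sketch of how an image looks as data. The 3x3 grayscale image, the 0 to 255 value range, and the simple averaging used to convert a color pixel to grayscale are assumptions for illustration. Other conversions weight the color channels differently.

```python
# Illustrative sketch only: a tiny made-up image.

# Grayscale: one number per pixel (0 = black, 255 = white).
# Together, these nine pixels draw a plus sign.
grayscale_image = [
    [0, 255, 0],
    [255, 255, 255],
    [0, 255, 0],
]

# Color: three numbers per pixel (red, green, blue).
red_pixel = (255, 0, 0)


def to_grayscale(pixel):
    """Convert a color pixel to grayscale by averaging its three channels."""
    red, green, blue = pixel
    return round((red + green + blue) / 3)


height = len(grayscale_image)
width = len(grayscale_image[0])
print(height, width)            # image dimensions in pixels
print(grayscale_image[0][1])    # one pixel alone does not reveal the shape
print(to_grayscale(red_pixel))
```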

    The task that needs to be performed decides the type of input needed and the output produced. Various architectures have evolved to handle this wide variety of tasks and data. These deep-learning model architectures are typically used to train models that perform vision tasks: convolutional neural networks, which detect patterns in images by learning hierarchical representations of visual features; YOLO, which stands for You Only Look Once and which processes the image and detects objects within it; and generative adversarial networks, which generate realistic-looking images.

    11:43

    Nikita: Nick, earlier you mentioned other AI tasks like anomaly detection, recommendations, and forecasting. Could you tell us more about them?

    Nick: Anomaly detection. Time-series data is required for anomaly detection, and it can be single-variable or multivariate, for use cases such as fraud detection and machine failure.

    Recommendations. You can recommend products using data of similar products or similar users.

    Forecasting. Time-series data is required for forecasting and can be used for things like weather forecasting and predicting the stock price. 

    12:22

    Lois: Nick, help me understand the difference between artificial intelligence, machine learning, and deep learning. Let’s start with AI. 

    Nick: Imagine a self-driving car that can make decisions like a human driver, such as navigating traffic or detecting pedestrians and making safe lane changes. AI refers to the broader concept of creating machines or systems that can perform tasks that typically require human intelligence. Next, we have machine learning or ML. Visualize a spam email filter that learns to identify and move spam emails to the spam folder, and that's based on the user's interaction and email content. Now, ML is a subset of AI that focuses on the development of algorithms that enable machines to learn from and make predictions or decisions based on data. 

    In the context of machine learning, an algorithm refers to a specific set of rules, mathematical equations, or procedures that the machine learning model follows to learn from data and make predictions. And finally, we have deep learning or DL. Think of an image recognition software that can identify specific objects or animals within images, such as recognizing cats in photos on the internet. DL is a subfield of ML that uses neural networks with many layers, deep neural networks, to learn and make sense of complex patterns in data.

    13:51

    Nikita: Are there different types of machine learning?

    Nick: There are several types of machine learning, including supervised learning, unsupervised learning, and reinforcement learning. Supervised learning is where the algorithm learns from labeled data, making predictions or classifications. Unsupervised learning is where an algorithm discovers patterns and structures in unlabeled data, such as through clustering or dimensionality reduction. And then, you have reinforcement learning, where agents learn to make predictions and decisions by interacting with an environment and receiving rewards or punishments.

    14:27

    Lois: Can we do a deep dive into each of these types you just mentioned? We can start with the supervised machine learning algorithm.

    Nick: Let's take an example of how a credit card company would approve a credit card. Once the application and documents are submitted, a verification is done, followed by a credit score check and another 10 to 15 days for approval. And how is this done? Sometimes purely manually, or by using a rules engine where you can build rules, give new data, and get a decision.

    The drawback is that this is slow. You need skilled people to build and update rules, and the rules keep changing. The good thing is that the businesses had a lot of insight as to how the decisions were made. Can we build rules by looking at the past data?

    We all learn by examples. Past data is nothing but a set of examples. Maybe reviewing past credit card approval history can help. Through a process of training, a model can be built that will have a specific intelligence to do a specific task. The heart of training a model is an algorithm that incrementally updates the model by looking at the data samples one by one.

    And once it's built, the model can be used to predict an outcome on new data. We can train the algorithm with credit card approval history to decide whether to approve a new credit card. And this is what we call supervised machine learning. It's learning from labeled data.
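
    Editor's note: The idea of an algorithm that updates a model one sample at a time can be sketched with the perceptron rule. The episode does not name a specific algorithm, so the perceptron is an assumption, as are the two features (a credit score scaled to the 0 to 1 range and a debt-to-income ratio) and every number in the approval history.

```python
# Illustrative sketch only: the perceptron rule and all data are assumptions.

# Past approval history: ([scaled credit score, debt-to-income ratio], label).
# Label 1 means the application was approved, 0 means it was rejected.
history = [
    ([0.90, 0.10], 1),
    ([0.80, 0.20], 1),
    ([0.85, 0.15], 1),
    ([0.40, 0.70], 0),
    ([0.50, 0.60], 0),
    ([0.30, 0.80], 0),
]

weights = [0.0, 0.0]   # the model starts out knowing nothing
bias = 0.0
LEARNING_RATE = 0.1


def predict(features):
    """Return 1 (approve) if the weighted sum is positive, else 0 (reject)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


# Training: look at the samples one by one and adjust the model after each mistake.
for _ in range(100):                       # passes over the history
    for features, label in history:
        error = label - predict(features)  # 0 when the prediction is already right
        for i, x in enumerate(features):
            weights[i] += LEARNING_RATE * error * x
        bias += LEARNING_RATE * error

# Inference on a new, unseen application.
print(predict([0.88, 0.12]))
```

    Unlike a hand-built rules engine, the decision rule here is derived from the labeled history, which is what makes it supervised learning.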

    15:52

    Lois: Ok, I see. What about the unsupervised machine learning algorithm?

    Nick: Data does not have a specific outcome or a label as we know it. And sometimes, we want to discover trends that the data has for potential insights. Similar data can be grouped into clusters. For example, in retail marketing and sales, a retail company may collect information like household size, income, location, and occupation so that the suitable clusters could be identified, like a small family or a high spender and so on. And that data can be used for marketing and sales purposes.

    Regulating streaming services. A streaming service may collect information like viewing sessions, minutes per session, number of unique shows watched, and so on. That can be used to regulate streaming services. Let's look at another example. We all know that fruits and vegetables have different nutritional elements. But do we know which of those fruits and vegetables are similar nutritionally?

    For that, we'll try to cluster fruits and vegetables' nutritional data and try to get some insights into it. This will help us include nutritionally different fruits and vegetables into our daily diets. Exploring patterns in data and grouping similar data into clusters drives unsupervised machine learning.

    17:13

    Nikita: And then finally, we come to the reinforcement learning algorithm. 

    Nick: How do we learn to play a game, say, chess? We'll make a move or a decision, check the feedback to see if it was the right move, and keep the outcomes in memory for the next step we take, which is learning. Reinforcement learning is a machine learning approach where a computer program learns to make decisions by trying different actions and receiving feedback. It teaches agents how to solve tasks by trial and error. This approach is used in autonomous car driving and robots as well. 
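
    The trial-and-error loop Nick describes can be sketched with tabular Q-learning on a toy task far simpler than chess: an agent in a five-cell corridor that is rewarded only for reaching the last cell. The corridor, the reward of 1.0, and the learning settings are assumptions chosen to keep the example small.

```python
import random

GOAL = 4                     # states 0..4 laid out in a row; reaching 4 ends the episode
ACTIONS = (-1, +1)           # step left or step right

def greedy(q, state):
    """Pick the best-known action, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if q[(state, a)] == best])

def learn(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2):
    """Tabular Q-learning: try actions, observe the reward, remember the outcome."""
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            explore = random.random() < epsilon
            action = random.choice(ACTIONS) if explore else greedy(q, state)
            nxt = min(max(state + action, 0), GOAL)     # walls at both ends
            reward = 1.0 if nxt == GOAL else 0.0        # feedback from the environment
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q

random.seed(0)
q = learn()
policy = [greedy(q, s) for s in range(GOAL)]
print(policy)  # learned action for states 0..3
```

    The table `q` plays the role of the "memory" in the chess analogy: each entry records how well an action has worked out from a given position so far.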

    17:46

    Lois: We keep coming across the term “deep learning.” You’ve spoken a bit about it a few times in this episode, but what is deep learning, really? How is it related to machine learning?

    Nick: Deep learning is all about extracting features and rules from data. Can we identify if an image is a cat or a dog by looking at just one pixel? Can we write rules to identify a cat or a dog in an image? Can the features and rules be extracted from the raw data, in this case, pixels? 

    Deep learning is really useful in this situation. It's a special kind of machine learning that trains super smart computer networks with lots of layers. And these networks can learn things all by themselves from pictures, like figuring out if a picture is a cat or a dog. 
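
    A small sketch can show why stacking layers helps when no single input decides the answer. The two-layer network below labels a two-pixel "image" as 1 only when exactly one pixel is on, which cannot be decided from either pixel alone. The weights are picked by hand purely for illustration; in real deep learning they are learned from data, and real networks are far larger.

```python
def step(x):
    """Threshold activation: the unit fires (1) when its input is positive."""
    return 1 if x > 0 else 0

def layer(inputs, weights, biases):
    """One fully connected layer: each unit takes a weighted sum of all inputs."""
    return [step(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hand-picked weights (a trained network would learn these from examples).
HIDDEN_W, HIDDEN_B = [[1, 1], [1, 1]], [-0.5, -1.5]   # features: "any pixel on", "both pixels on"
OUTPUT_W, OUTPUT_B = [[1, -1]], [-0.5]                # fires for "any" but not "both"

def network(pixels):
    """Two stacked layers: raw pixels -> intermediate features -> decision."""
    return layer(layer(pixels, HIDDEN_W, HIDDEN_B), OUTPUT_W, OUTPUT_B)[0]

for pixels in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pixels, network(pixels))
```

    The hidden layer turns raw pixels into intermediate features, and the output layer combines those features into a decision, which is the layered feature extraction the episode refers to.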

    18:28

    Lois: I know we’re going to be covering this in detail in an upcoming episode, but before we let you go, can you briefly tell us about generative AI?

    Nick: Generative AI, a subset of machine learning, creates diverse content like text, audio, images, and more. These models, often powered by neural networks, learn patterns from existing data to craft fresh and creative output. For instance, ChatGPT generates text-based responses by understanding patterns in text data that it's been trained on. Generative AI plays a vital role in various AI tasks requiring content creation and innovation. 

    19:07

    Nikita: Thank you, Nick, for sharing your expertise with us. To learn more about AI, go to mylearn.oracle.com and search for the Oracle Cloud Infrastructure AI Foundations course. As you complete the course, you’ll find skill checks that you can attempt to solidify your learning. 

    Lois: And remember, the AI Foundations course on MyLearn also prepares you for the Oracle Cloud Infrastructure 2023 AI Foundations Associate certification. Both the course and the certification are free, so there’s really no reason NOT to take the leap into AI, right Niki?

    Nikita: That’s right, Lois!

    Lois: In our next episode, we will look at the fundamentals of machine learning. Until then, this is Lois Houston…

    Nikita: And Nikita Abraham signing off!

    19:52

    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

    Everything You Need to Know to Get Certified on Oracle Autonomous Database

    How do I get certified in Oracle Autonomous Database? What material can I use to prepare for it? What's the exam like? How long is the certification valid for?
     
    If these questions have been keeping you up at night, then join Lois Houston and Nikita Abraham in their conversation with Senior Principal OCI Instructor Susan Jang to understand the process of getting certified and begin your learning adventure.
     
    Oracle MyLearn: mylearn.oracle.com/
    Oracle University Learning Community: education.oracle.com/ou-community
    X (formerly Twitter): twitter.com/Oracle_Edu
     
    Special thanks to Arijit Ghosh, David Wright, and the OU Studio Team for helping us create this episode.
     
    --------------------------------------------------------
     
    Episode Transcript
     

    00:00
    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

    00:26
    Lois: Hello and welcome to the Oracle University Podcast. I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Principal Technical Editor.


    Nikita: Hi everyone! If you’ve listened to us these last few weeks, you’ll know we’ve been discussing Oracle Autonomous Database in detail. We looked at Autonomous Database on serverless and dedicated infrastructure.

    00:51

    Lois: That’s right, Niki. Then, last week, we explored Autonomous Database tools. Today, we thought we’d wrap up our focus on Autonomous Database by talking about the training offered by Oracle University, the associated certification, how to prepare for it, what you should do next, and more.

    Nikita: Yeah, we’ll get answers to all the big questions. And we’re going to get them from Susan Jang. Sue is a Senior Principal OCI Instructor with Oracle University. She has created and delivered training in Oracle databases and Oracle Cloud Infrastructure for over 20 years. Hi Sue! Thanks for joining us today.

    Sue: Happy to be here!

    01:29

    Lois: Sue, what training does Oracle have on Autonomous Database?
     
    Sue: Oracle University offers a professional-level course called the Oracle Autonomous Database Administration Workshop. So, if you want to learn to deploy and administer autonomous databases, this is the one for you. You’ll explore the fundamentals of autonomous databases, their features, and benefits. You’ll learn about the technical architecture and the tasks that are involved in creating an autonomous database on both shared and dedicated Exadata infrastructure. You’ll discover what Machine Learning is, what APEX (Application Express) is, and what SQL Developer Web is, all of which are deployed with the Autonomous Database. So basically everything you need to take your skills to the next level and become a proficient database administrator is in this course.

    02:28

    Nikita: Who can take this course, Sue? 
     
    Sue: The course is really for anyone interested in Oracle Autonomous Database, whether you’re a database administrator, a cloud data management professional, or a consultant.

    The topics in the course include everything from the features of an Autonomous Database through provisioning, managing, and monitoring the database.

    Most people think that just because it is an Autonomous Database, Oracle will do everything for you, and there is nothing a DBA can do or needs to do. But that’s not true.
     
    An Oracle Autonomous Database automates the day-to-day DBA tasks, like tuning the database to ensure it is running at performance level or that the backups are done successfully. By letting the Autonomous Database perform those tasks, it gives the database administrator time to fully understand the new features of an Oracle database and figure out how to implement the features that will benefit the DBA’s company.

    03:30

    Lois: Would a non-database administrator benefit from taking this course?
     
    Sue: Yes, Lois. Oracle courses are designed in modules, so you can focus on the modules that meet your needs. For example, if you’re a senior technical manager, you may not need to manage and monitor the Autonomous Database. But still, it’s important to understand its features and architecture to know how other Oracle products integrate with the database.

    03:57

    Nikita: Right. Talking about the course itself, each module consists of videos that teach different concepts, right?

    Sue: Yes, Niki. Each video covers one topic. A group of topics, or I should say a group of related topics, makes up a module.

    We know your time is important to you, and your success is important to us. You don’t just want to spend time taking training. You want to know that you’re really understanding the concepts of what you are learning. 

    So to help you do this, we have skill checks at the end of most modules. You must successfully answer 80% of the questions to pass these knowledge checks. These checks are an excellent way to ensure that you’re on the right track and have the understanding of each module before you move on to the next one. 

    04:48
     
    Lois: That’s great. And are there any other resources to help reinforce what’s been learned?
     
    Sue: I grew up with this phrase from my Mom. Education was her career. I remember hearing, “I hear and I forget. I see and I remember. I do and I understand.” It’s important to us that you understand the concepts and can actually “do” or “perform” the tasks. 

    You'll find several demos in the different modules of the Autonomous Database Administration Workshop. These videos are where the instructor shows you how to perform the tasks so you can reinforce what you learned in the lessons. You’ll find demos on provisioning an autonomous database, creating an autonomous database clone, and configuring disaster recovery, and lots more.
     
    Oracle also has what we call LiveLabs. These are a series of hands-on tutorials with step-by-step instructions to guide you through performing the tasks.

    05:49

    Nikita: I love the idea of LiveLabs. You can follow instructions on how to perform administrative tasks and then practice doing that on your own.

    Lois: Yeah, that’s fantastic. OK Sue, say I’ve taken the course. What do I do next? 


    Sue: Well, after you’ve taken the course, you’ll want to demonstrate your expertise with a certification. Because you want to get that better job. You want to increase your earning potential. You need to take the certification called the Oracle Autonomous Database Cloud Professional.

    We have a couple of resources to help you along the way to ensure you succeed in securing that certification.

    In MyLearn, the Oracle University online learning platform, you’ll see that the course, Oracle Autonomous Database Administration Workshop, falls within a learning path called Become an Oracle Autonomous Database Cloud Professional. The course is the first section of this learning path. The next section is a video describing the certification exam and how to prepare for it. The section after that is a practice exam. Now, though it doesn’t have the actual questions, you’ll find the practice exam will give you a good idea of the type of questions that will be asked in the exam. 

    07:10

    Lois: OK, so now I’ve done all that, and I’m ready to validate my knowledge and expertise. Tell me more about the certification, Sue.

    Sue: To get the certification, you must take an online exam. The duration of the exam is 90 minutes. It’s a Multiple Choice format, and there are 60 questions to the exam. 

    By getting this certification, you’re demonstrating to the world that you have the knowledge to provision, manage, and monitor, as well as migrate workloads to the Autonomous Database, on both a shared as well as a dedicated Exadata infrastructure. You will show you have an understanding of the architecture of the Autonomous Database and can successfully use the features as well as its workflow, and you are capable of using Autonomous Database tools in developing an Autonomous Database.

    08:05

    Nikita: Great! So what do I need to do to take the exam?
    Sue: We assume you’ve already taken the course (making sure that you’re up to date with the training), that you’ve taken the time to study the topics in depth rather than memorizing superficial information just to pass the exam, looked at the available preparation material, and you’ve also taken the practice exam. I highly recommend that you have the hands-on experience or practice on an Autonomous Database before you take the certification exam.

    08:38
    Nikita: Hold on, Sue. You said to make sure we’re up to date with the training. How do I do that?

    Sue: Technology is ever-changing, and at Oracle, we continually enhance our products to provide features that make them faster or more straightforward to use. So, if you’re taking a course, you may find a small tag that says “New” next to a topic. That indicates that there is some new training that’s been added to the course. So what I’m trying to say is if you’re looking to take a certification, check the course before you register for the exam to see if there are any “New” tags. If you find them, you can learn what’s new and not have to go through the entire course again. This way, you’re up to date with the training!

    09:25

    Nikita: Ok. Got it. Tell us more about the certification, Sue.
    Sue: If you’re ready, search for the Become An Oracle Autonomous Database Cloud Professional learning path in MyLearn and scroll down to the Oracle Database Cloud Professional exam. Click on the “Register Now” button. You’ll be taken to a page where you’ll see the exam overview, the resources to help you prepare for the exam, a button to register for the exam, and things to do before your exam session. It will also describe what happens after the exam and some exam policies, like what to do if you need to reschedule your exam.
    When you’re ready to take the exam, you can schedule the date and time according to when it’s convenient for you. 

    10:15

    Lois: What’s the actual experience of taking the exam like?

    Sue: It’s pretty straightforward. You want to prepare your system a day or two before the exam. You want to ensure you can connect successfully to the test site and that your laptop is plugged in and not running on battery. You want to make sure all other applications are closed before you perform the system test. Now, the system test is really with the test site and consists of testing your microphone, an internet speed test, and your video.

    You will also be asked to do a test-exam simulation. You will need to be able to download the simulation exam and answer a few simple true or false questions. Once you have successfully done that, you’re ready to take the test on your laptop on the actual day of the test.

    Now, on the day of the test, set up your test environment. What that really entails is that you do not have anything on your desk. You cannot have a second monitor. And it’s best to have a clear wall behind you so that the proctor can see there is nothing around you. And don’t forget to turn off your mobile device.

    11:34

    Lois: Ok, I’ve taken the test, and I passed. Woohoo! What happens now?

    Sue: When you pass the exam, you will receive an email from Oracle with your results as well as a link to Oracle CertView. This is the Oracle certification candidate portal. In CertView, you can download and print your eCertificate. You can share your newly earned badge on places like Facebook, Twitter, and LinkedIn, or even email your employer and others a secure link that they can use to confirm and validate your credentials.

    12:11

    Nikita: Can anyone take the certification?

    Sue: Yes, Niki. This certification is available to all candidates, including on-premise database administrators, cloud data management professionals, and consultants.

    12:24

    Lois: How long is the certification valid? What happens when it expires?

    Sue: Certain Oracle credentials require periodic recertification for Oracle to recognize them as "active." For such credentials, you must upgrade to a current version within 12 months following the Oracle credential retirement to keep your certification active.

    12:51

    Are you planning to become an Oracle Certified Professional this year? Whether you're a seasoned IT pro or just starting your career, getting certified can give you a significant boost. And don't worry, we've got your back. Join us at one of our cert prep live events in the Oracle University Learning Community. You'll get insider tips from seasoned experts and learn from other professionals' experiences. Plus, once you've earned your certification, you'll become part of our exclusive forum for Oracle-certified users. So, what are you waiting for? Head over to mylearn.oracle.com and create an account to jump-start your journey towards certification today!

    13:35

    Nikita: Welcome back. Sue, what other training can I take after Autonomous Database? 
     
    Sue: Now that you have a strong foundation in the database, there is so much more that you can learn in Oracle. You can consider Exadata if you work on a high-performance data workload that’s running mission-critical applications. Look for a learning path called Become an Exadata Service Cloud Administrator, in MyLearn, to help you with that. GoldenGate is also a good choice if you work with data that needs to be shared and replicated, both locally as well as globally. The course for this is called Oracle GoldenGate 19c: Administration/Implementation.  

    A hot topic in technology today is generative AI (Artificial Intelligence). You want to learn how to implement data security on different levels when it needs to be shared with large language model providers. 

    Perhaps venture beyond the database and learn about Oracle Cloud Infrastructure and how its components and the many cloud services work together. Just go to mylearn.oracle.com, and in the field where you see “What do you want to learn?” type in what interests you and let your learning adventure begin!

    14:59

    Lois: And since you brought up AI, Sue, this is the perfect time to mention that we’ll be focusing on it for the next couple of weeks. We’ll be speaking to some of our colleagues on topics like artificial intelligence, machine learning, deep learning, generative AI, the OCI AI portfolio and more, but we’ll talk more about that next week.

    Nikita: Yeah, can’t wait for that. Thank you so much, Sue, for giving us your time today.

    Sue: Thanks for having me!

    Lois: Until next time, this is Lois Houston…

    Nikita: And Nikita Abraham, signing off!

    15:29

    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

    Autonomous Database Tools

    In this episode, hosts Lois Houston and Nikita Abraham speak with Oracle Database experts about the various tools you can use with Autonomous Database, including Oracle Application Express (APEX), Oracle Machine Learning, and more.
     
     
    Oracle University Learning Community: https://education.oracle.com/ou-community
     
     
    X (formerly Twitter): https://twitter.com/Oracle_Edu
     
    Special thanks to Arijit Ghosh, David Wright, Tamal Chatterjee, and the OU Studio Team for helping us create this episode.
     
    ---------------------------------------------------------
     
    Episode Transcript:

    00:00

    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

    00:26

    Lois: Hello and welcome to the Oracle University Podcast. I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Principal Technical Editor.

    Nikita: Hi everyone! We spent the last two episodes exploring Oracle Autonomous Database’s deployment options: Serverless and Dedicated. Today, it’s tool time!

    Lois: That’s right, Niki. We’ll be chatting with some of our Database experts on the tools that you can use with the Autonomous Database. We’re going to hear from Patrick Wheeler, Kay Malcolm, Sangeetha Kuppuswamy, and Thea Lazarova.

    Nikita: First up, we have Patrick, to take us through two important tools. Patrick, let’s start with Oracle Application Express. What is it and how does it help developers?

    01:15

    Patrick: Oracle Application Express, also known as APEX-- or perhaps APEX, we're flexible like that-- is a low-code development platform that enables you to build scalable, secure, enterprise apps with world-class features that can be deployed anywhere. Using APEX, developers can quickly develop and deploy compelling apps that solve real problems and provide immediate value. You don't need to be an expert in a vast array of technologies to deliver sophisticated solutions. Focus on solving the problem, and let APEX take care of the rest.

    01:52

    Lois: I love that it’s so easy to use. OK, so how does Oracle APEX integrate with Oracle Database? What are the benefits of using APEX on Autonomous Database?

    Patrick: Oracle APEX is a fully supported, no-cost feature of Oracle Database. If you have Oracle Database, you already have Oracle APEX. You can access APEX from database actions. Oracle APEX on Autonomous Database provides a preconfigured, fully managed, and secure environment to both develop and deploy world-class applications.
    Oracle takes care of configuration, tuning, backups, patching, encryption, scaling, and more, leaving you free to focus on solving your business problems. APEX enables your organization to be more agile and develop solutions faster for less cost and with greater consistency. You can adapt to changing requirements with ease, and you can empower professional developers, citizen developers, and everyone else.

    02:56

    Nikita: So you really don’t need to have a lot of specializations or be an expert to use APEX. That’s so cool! Now, what are the steps involved in creating an application using APEX? 

    Patrick: You will be prompted to log in as the administrator at first. Then, you may create workspaces for your respective users and log in with those associated credentials. Application Express provides you with an easy-to-use, browser-based environment to load data, manage database objects, develop REST interfaces, and build applications which look and run great on both desktop and mobile devices.

    You can use APEX to develop a wide variety of solutions, import spreadsheets, and develop a single source of truth in minutes. Create compelling data visualizations against your existing data, deploy productivity apps to elegantly solve a business need, or build your next mission-critical data management application. There are no limits on the number of developers or end users for your applications.

    04:01

    Lois: Patrick, how does APEX use SQL? What role does SQL play in the development of APEX applications? 

    Patrick: APEX embraces SQL. Anything you can express with SQL can be easily employed in an APEX application. Application Express also enables low-code development, providing developers with powerful data management and data visualization components that deliver modern, responsive end user experiences out-of-the-box. Instead of writing code by hand, you're able to use intelligent wizards to guide you through the rapid creation of applications and components.

    Creating a new application from APEX App Builder is as easy as one, two, three. One, in App Builder, select a project name and appearance. Two, add pages and features to the app. Three, finalize settings, and click Create.

    05:00

    Nikita: OK. So, the other tool I want to ask you about is Oracle Machine Learning. What can you tell us about it, Patrick?
    Patrick: Oracle Machine Learning, or OML, is available with Autonomous Database. A new capability that we've introduced with Oracle Machine Learning is called Automatic Machine Learning, or AutoML. Its goal is to increase data scientist productivity while reducing overall compute time. In addition, AutoML enables non-experts to leverage machine learning by not requiring deep understanding of the algorithms and their settings.

    05:37

    Lois: And what are the key functions of AutoML?
    Patrick: AutoML consists of three main functions: Algorithm Selection, Feature Selection, and Model Tuning. With Automatic Algorithm Selection, the goal is to identify the in-database algorithms that are likely to achieve the highest model quality. Using metalearning, AutoML leverages machine learning itself to help find the best algorithm faster than with exhaustive search.
    With Automatic Feature Selection, the goal is to denoise data by eliminating features that don't add value to the model. By identifying the most predictive features and eliminating noise, model accuracy can often be significantly improved with a side benefit of faster model building and scoring.

    Automatic Model Tuning tunes algorithm hyperparameters, those parameters that determine the behavior of the algorithm, on the provided data. Auto Model Tuning can significantly improve model accuracy while avoiding manual or exhaustive search techniques, which can be costly both in terms of time and compute resources.
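
    To give a feel for the feature selection idea, here is a minimal sketch that keeps only the columns whose correlation with the target is strong enough. This is not how Oracle's AutoML is implemented (the episode does not describe its internals); the correlation filter, the 0.3 threshold, and the data are assumptions for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread = math.sqrt(sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys))
    return cov / spread if spread else 0.0

def select_features(columns, target, threshold=0.3):
    """Keep only the columns whose correlation with the target is strong enough."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]

# Invented data: two informative columns and one that is pure noise.
columns = {
    "income":    [9.0, 8.0, 7.5, 3.0, 2.0, 4.0],
    "debt":      [1.0, 2.0, 0.5, 4.0, 3.5, 6.0],
    "shoe_size": [7.0, 9.0, 8.0, 8.0, 7.0, 9.0],
}
approved = [1, 1, 1, 0, 0, 0]
selected = select_features(columns, approved)
print(selected)
```

    Dropping the noisy column before training means fewer inputs to process, which is the "faster model building and scoring" side benefit Patrick mentions.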

    06:44

    Lois: How does Oracle Machine Learning leverage the capabilities of Autonomous Database?

    Patrick: With Oracle Machine Learning, the full power of the database is accessible with the tremendous performance of parallel processing available, whether the machine learning algorithm is accessed via native database SQL, through Python with OML4Py, or through R with OML4R. 

    07:07

    Nikita: Patrick, talk to us about the Data Insights feature. How does it help analysts uncover hidden patterns and anomalies?
    Patrick: A feature I wanted to call the electromagnet, but they didn't let me. An analyst's job can often feel like looking for a needle in a haystack. So throw the switch and all that metallic stuff is going to slam up onto that electromagnet. Sure, there are going to be rusty old nails and screws and nuts and bolts, but there are going to be a few needles as well. It's far easier to pick the needles out of these few bits of metal than go rummaging around in a pile of hay, especially if you have allergies.

    That's more or less how our Insights tool works. Load your data, kick off a query, and grab a cup of coffee. Autonomous Database does all the hard work, scouring through this data looking for hidden patterns, anomalies, and outliers. Essentially, we run some analytic queries that predict expected values.
    And where the actual values differ significantly from expectation, the tool presents them here. Some of these might be uninteresting or obvious, but some are worthy of further investigation. You get this dashboard of various exceptional data patterns. Drill down on a specific gauge in this dashboard and significant deviations between actual and expected values are highlighted.
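
    The "expected versus actual" idea can be sketched in a few lines. The example below predicts each cell of a small region-by-month table from its row and column totals and flags cells that deviate by more than 25%. The prediction rule, the tolerance, and the sales figures are assumptions; the episode does not say which analytic queries the Insights tool actually runs.

```python
from collections import defaultdict

# Invented monthly sales by region.
sales = {
    ("North", "Jan"): 100, ("North", "Feb"): 110,
    ("South", "Jan"): 200, ("South", "Feb"): 220,
    ("West", "Jan"): 50,   ("West", "Feb"): 150,
}

def surprising_cells(cells, tolerance=0.25):
    """Flag cells whose actual value is far from what row and column totals predict."""
    row_total, col_total = defaultdict(float), defaultdict(float)
    for (row, col), value in cells.items():
        row_total[row] += value
        col_total[col] += value
    grand_total = sum(cells.values())
    flagged = []
    for (row, col), actual in cells.items():
        expected = row_total[row] * col_total[col] / grand_total
        deviation = (actual - expected) / expected
        if abs(deviation) > tolerance:
            flagged.append(((row, col), round(expected, 1), actual))
    return flagged

for cell, expected, actual in surprising_cells(sales):
    print(cell, "expected about", expected, "but saw", actual)
```

    Only the cells that break the overall pattern are reported, which is the needle-on-the-magnet effect: the analyst reviews a short list instead of the whole table.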

    08:28

    Lois: What a useful feature! Thank you, Patrick. Now, let’s discuss some terms and concepts that are applicable to the Autonomous JSON Database with Kay. Hi Kay, what’s the main focus of the Autonomous JSON Database? How does it support developers in building NoSQL-style applications?

    Kay: Autonomous Database supports the JavaScript Object Notation, also known as JSON, natively in the database. It supports applications that use the SODA API to store and retrieve JSON data, or SQL queries to store and retrieve JSON-formatted data. 

    Oracle AJD is Oracle ATP, Autonomous Transaction Processing, but it's designed for developing NoSQL-style applications that use JSON documents. You can promote an AJD service to ATP.

    09:22

    Nikita: What makes the development of NoSQL-style, document-centric applications flexible on AJD? 

    Kay: Development of these NoSQL-style, document-centric applications is particularly flexible because the applications use schemaless data. This lets you quickly react to changing application requirements. There's no need to normalize the data into relational tables and no impediment to changing the data structure or organization at any time, in any way. A JSON document has its own internal structure, but no relation is imposed on separate JSON documents.

    Nikita: What does AJD do for developers? How does it actually help them?

    Kay: So Autonomous JSON Database, or AJD, is designed for you, the developer, to allow you to use simple document APIs and develop applications without having to know anything about SQL. That's a win.

    But at the same time, it does give you the ability to create highly complex SQL-based queries for reporting and analysis purposes. It has built-in binary JSON storage type, which is extremely efficient for searching and for updating. It also provides advanced indexing capabilities on the actual JSON data.

    It's built on Autonomous Database, so that gives you all of the self-driving capabilities we've been talking about, but you don't need a DBA to look after your database for you. You can do it all yourself.

    11:00

    Lois: For listeners who may not be familiar with JSON, can you tell us briefly what it is? 

    Kay: So I mentioned this earlier, but it's worth mentioning again. JSON stands for JavaScript Object Notation. It was originally developed as a human readable way of providing information to interchange between different programs.

    So a JSON document is a set of fields. Each of these fields has a value, and those values can be of various data types. We can have simple strings, we can have integers, we can even have real numbers. We can have Booleans that are true or false. We can have date strings, and we can even have the special value null.

    Additionally, values can be objects, and objects are effectively whole JSON documents embedded inside a document. And of course, there's no limit on the nesting. You can nest as far as you like. Finally, we can have arrays, and arrays can hold a list of scalar data types or a list of objects.
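
    The value types Kay lists are easy to see in a small document. The sketch below parses an invented JSON document with Python's standard `json` module and prints the type each field maps to; the field names and values are made up for the example.

```python
import json

# An invented document covering the value types mentioned above.
text = """
{
  "name": "Ada",
  "age": 36,
  "height_m": 1.68,
  "active": true,
  "joined": "2023-11-07",
  "nickname": null,
  "address": {"city": "Austin", "geo": {"lat": 30.27, "lon": -97.74}},
  "skills": ["SQL", "Python"],
  "orders": [{"id": 1, "total": 19.99}, {"id": 2, "total": 5.0}]
}
"""

doc = json.loads(text)
for field, value in doc.items():
    print(f"{field}: {type(value).__name__}")

print(doc["address"]["geo"]["lat"])   # nested objects are reached field by field
print(doc["orders"][1]["total"])      # arrays can hold whole objects
```

    Note that JSON has no date type of its own, so the date travels as a string, exactly as described in the episode.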

    12:13

    Nikita: Kay, how does the concept of schema apply to JSON databases?

    Kay: Now, JSON documents are stored in something that we call collections. Each document may have its own schema, its own layout, to the JSON. So does this mean that JSON document databases are schemaless? Hmmm. Well, yes. But there's nothing to fear because you can always use a check constraint to enforce a schema constraint that you wish to introduce to your JSON data.
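
    As a loose analogy for the check constraint Kay mentions, the sketch below models a collection that accepts any document shape unless a check rule is attached. The `Collection` class and the `is_valid_customer` rule are hypothetical stand-ins written for this example, not part of any Oracle API; in the database itself the rule would be declared as a constraint rather than coded in the application.

```python
def is_valid_customer(doc):
    """Hypothetical rule: every customer document needs a text name and a numeric age."""
    return isinstance(doc.get("name"), str) and isinstance(doc.get("age"), (int, float))

class Collection:
    """In-memory stand-in for a document collection with an optional check rule."""
    def __init__(self, check=None):
        self.check = check
        self.documents = []

    def insert(self, doc):
        if self.check is not None and not self.check(doc):
            raise ValueError(f"document rejected by check rule: {doc}")
        self.documents.append(doc)

customers = Collection(check=is_valid_customer)
customers.insert({"name": "Ada", "age": 36, "tags": ["vip"]})   # extra fields are fine
try:
    customers.insert({"name": "Bob"})                           # missing "age"
except ValueError as err:
    print(err)
```

    The rule constrains only the fields it names, so documents stay free to carry whatever else they need.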

    Lois: Kay, what about indexing capabilities on JSON collections?

    Kay: You can create indexes on a JSON collection, and those indexes can be of various types, including our flexible search index, which indexes the entire content of the document within the JSON collection, without having to know anything in advance about the schema of those documents. 
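
    One way to picture an index that needs no advance knowledge of the schema is to walk every document and record each path and value it finds. The sketch below does that for two differently shaped documents. It is only a conceptual illustration with an invented path notation, and it says nothing about how Oracle's search index is actually built.

```python
from collections import defaultdict

def walk(value, path=""):
    """Yield (path, scalar) pairs for every leaf in a nested JSON-like value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for child in value:
            yield from walk(child, path + "[]")
    else:
        yield path, value

def build_index(collection):
    """Map each (path, value) pair to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, doc in collection.items():
        for path, leaf in walk(doc):
            index[(path, leaf)].add(doc_id)
    return index

# Two invented documents with unrelated layouts.
collection = {
    1: {"name": "Ada", "address": {"city": "Austin"}, "skills": ["SQL", "Python"]},
    2: {"title": "Invoice", "lines": [{"sku": "A1"}, {"sku": "B2"}], "city": "Austin"},
}
index = build_index(collection)
print(index[("skills[]", "SQL")])
print(index[("lines[].sku", "B2")])
```

    Because the paths are discovered while walking the documents, a new document with a brand-new layout is indexed without any change to the code.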

    Lois: Thanks Kay!

    13:18

    AI is being used in nearly every industry—healthcare, manufacturing, retail, customer service, transportation, agriculture, you name it! And, it’s only going to get more prevalent and transformational in the future. So it’s no wonder that AI skills are the most sought after by employers. 

    We’re happy to announce a new OCI AI Foundations certification and course that is available—for FREE! Want to learn about AI? Then this is the best place to start! So, get going! Head over to mylearn.oracle.com to find out more. 

    13:54

    Nikita: Welcome back! Sangeetha, I want to bring you in to talk about Oracle Text. Now I know that Oracle Database is not only a relational store but also a document store. And you can load text and JSON assets along with your relational assets in a single database. 

    When I think about Oracle and databases, SQL development is what immediately comes to mind. So, can you talk a bit about the power of SQL as well as its challenges, especially in schema changes?

    Sangeetha: Traditionally, Oracle has been all about SQL development. And with SQL development, it's an incredibly powerful language. But it does take some advanced knowledge to make the best of it.

    So SQL requires you to define your schema up front. And making changes to that schema can be a tricky and sometimes highly bureaucratic task. In contrast, JSON allows you to develop your schema as you go--the schemaless, or perhaps schema-later, model. By imposing less rigid requirements on the developer, it allows for a more fluid and Agile development style.

    15:09

    Lois: How does Oracle Text use SQL to index, search, and analyze text and documents that are stored in the Oracle Database?

    Sangeetha: Oracle Text can perform linguistic analyses on documents as well as search text using a variety of strategies, including keyword searching, context queries, Boolean operations, pattern matching, mixed thematic queries, HTML/XML section searching, and so on.

    It can also render search results in various formats, including unformatted text, HTML with term highlighting, and original document format. Oracle Text supports multiple languages and uses advanced relevance-ranking technology to improve search quality. Oracle Text also offers advanced features like classification, clustering, and support for information visualization metaphors.

    Oracle Text is now enabled automatically in Autonomous Database. It provides full-text search capabilities over text, XML, and JSON content. You can also use it to extend current applications to make better use of textual fields, and to build new applications specifically targeted at document searching.

    Now, with all of the power of Oracle Database, a familiar development environment, and rock-solid Autonomous Database infrastructure for your text apps, we can deal with text in many different places and with many different types of text. So it is not just in the database. We can deal with data that's outside of the database as well.

    17:03

    Nikita: How does it handle text in various places and formats, both inside and outside the database?

    Sangeetha: So in the database, we can be looking at a VARCHAR2 column, a LOB column, or a binary LOB column if we are talking about binary documents such as PDF or Word. Outside of the database, we might have a document on the file system or out on the web, with URLs pointing to the document.

    If they are on the file system, then we would have a file name stored in the database table. And if they are on the web, then we would have a URL or a partial URL stored in the database. And we can then fetch the data from the locations and index it in the term documents format.

    We recognize many different document formats and extract the text from them automatically. So the basic forms we can deal with are plain text, HTML, JSON, and XML, and then formatted documents like Word docs, PDF documents, PowerPoint documents, and many other types of documents. All of those are automatically handled by the system and then processed into a format for indexing.

    And we are not restricted to English here, either. There are various stages in the index pipeline. A document starts at one end and is taken through the different stages until it finally reaches the index.

    18:44

    Lois: You mentioned the indexing pipeline. Can you take us through it?

    Sangeetha: So it starts with a data store. That's responsible for actually fetching the document. So once we fetch the document from the data store, we pass it on to the filter. And now the filter is responsible for processing binary documents into indexable text.

    So if you have, let's say, a PDF document, that will go through the filter, which will extract any images and turn the document into a stream of HTML text ready for indexing. Then we pass it on to the sectioner, which is responsible for identifying things like paragraphs and sentences. The output from the sectioner is fed to the lexer.

    The lexer is responsible for dividing the text into indexable words. The output of the lexer is fed into the index engine, which is responsible for laying out the indexes on disk. Storage, word list, and stop list are some additional inputs there.
    So storage tells it exactly how to lay out the index on disk. The word list holds special preferences, like desegmentation. And the stop list is a list of words that we don't want to index. Each of these stages and inputs can be customized.
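
    As a rough illustration of the pipeline Sangeetha walks through, here is a minimal Python sketch of the stages in order: data store, filter, sectioner, lexer, and index engine, with a stop list as an extra input. It is a toy model and not how Oracle Text is implemented. The names DOCUMENTS, STOP_LIST, and every function below are assumptions made up for this example, and the filter simply passes text through because the sample documents are already HTML-like text.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "table": document id -> raw content (illustrative only).
DOCUMENTS = {
    1: "<p>Oracle Text indexes documents.</p><p>The index is stored on disk.</p>",
    2: "<p>A stop list holds words that are not indexed.</p>",
}

# Words we choose not to index (an assumed, tiny stop list).
STOP_LIST = {"a", "the", "is", "on", "that", "are", "not"}


def datastore(doc_id):
    """Fetch the raw document (from a dict here, instead of a column, file, or URL)."""
    return DOCUMENTS[doc_id]


def filter_stage(raw):
    """Stand-in for the filter: the samples are already text, so pass them through."""
    return raw


def sectioner(markup):
    """Split on closing paragraph tags, then strip any remaining markup."""
    parts = re.split(r"</p>", markup)
    return [re.sub(r"<[^>]+>", "", part).strip() for part in parts if part.strip()]


def lexer(section):
    """Divide a section into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", section.lower())


def build_index(doc_ids, stop_list=STOP_LIST):
    """Index engine: map each non-stop word to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id in doc_ids:
        for section in sectioner(filter_stage(datastore(doc_id))):
            for token in lexer(section):
                if token not in stop_list:
                    index[token].add(doc_id)
    return dict(index)


INDEX = build_index(DOCUMENTS)
```

    Looking up INDEX["index"] returns the set of document ids that contain that word, while stop words such as "the" never make it into the index at all.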

    Oracle has something known as the extensibility framework, which was originally designed to allow people to extend the capabilities of these products by adding new domain indexes. And this is what we've used to implement Oracle Text. So when the kernel sees the phrase INDEXTYPE ctxsys.context, it knows to handle all of the hard work of creating the index.
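
    To make the INDEXTYPE phrase concrete, the sketch below uses two small Python helpers that only assemble SQL text: a CREATE INDEX statement with the INDEXTYPE IS CTXSYS.CONTEXT clause, and a query using the CONTAINS operator with SCORE for relevance ranking. The helper names and the docs, body, and docs_idx identifiers are assumptions for illustration. Nothing here connects to a database, and the identifiers are interpolated as trusted names, so this is not meant for untrusted input.

```python
def context_index_ddl(index_name, table, column):
    """Return a CREATE INDEX statement that requests an Oracle Text CONTEXT index."""
    return (
        f"CREATE INDEX {index_name} ON {table} ({column}) "
        "INDEXTYPE IS CTXSYS.CONTEXT"
    )


def contains_query(table, column):
    """Return a ranked full-text query; the search terms are passed as the :terms bind."""
    return (
        f"SELECT SCORE(1) AS relevance, t.* FROM {table} t "
        f"WHERE CONTAINS(t.{column}, :terms, 1) > 0 "
        "ORDER BY SCORE(1) DESC"
    )


# Hypothetical table and column names, used only to show the generated text.
DDL = context_index_ddl("docs_idx", "docs", "body")
QUERY = contains_query("docs", "body")
```

    Running the generated statements would require a database connection and a table with a text column, both of which are left out of this sketch.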

    20:48

    Nikita: Other than text indexing, Oracle Text offers additional operations, right? Can you share some examples of these operations?

    Sangeetha: So beyond the text index, there are other operations that we can do with Oracle Text, some of which are search related. Some examples of that are highlighting, markup, and snippets. Highlighting and markup are very similar. They are ways of fetching results back from a search, marked up with highlighting within the document text.
    Snippet is very similar, but it only brings back the relevant chunks from the document that we are searching. So rather than getting the whole document back, you just get a few lines showing the search terms in context. Then there's theme extraction. Oracle Text is capable of figuring out what a text is all about. We have a very large knowledge base of the English language, which allows it to understand the concepts and the themes in the document.
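
    The difference between markup and a snippet can be sketched in a few lines of Python. This is only an illustration of the idea and does not call Oracle Text. The function names, the <b> tags, and the radius parameter are all assumptions chosen for the example.

```python
import re


def markup(text, term, open_tag="<b>", close_tag="</b>"):
    """Return the whole text with every case-insensitive match of term wrapped in tags."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)


def snippet(text, term, radius=20):
    """Return only the chunk around the first match, with the match marked up."""
    match = re.search(re.escape(term), text, re.IGNORECASE)
    if match is None:
        return ""
    start = max(0, match.start() - radius)
    end = min(len(text), match.end() + radius)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + markup(text[start:end], term) + suffix
```

    Here markup returns the full document with the hits tagged, while snippet returns just a short window around the first hit, which is usually what a search results page needs.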

    Then there's entity extraction, which is the ability to find out people, places, dates, times, zip codes, et cetera in the text. So this can be customized with your own user dictionary and your own user rules.

    22:14

    Lois: Moving on to advanced functionalities, how does Oracle Text utilize machine learning algorithms for document classification? And what are the key types of classifications?
    Sangeetha: The text analytics uses machine learning algorithms for document classification. We can process a large set of data documents in a very efficient manner using Oracle's own machine learning algorithms. So you can look at that as basically three different headings. First of all, there's classification. And that comes in two different types-- supervised and unsupervised.

    Supervised classification means that you provide a training set, a set of documents that have already been labeled with the particular characteristics that you're looking for. And then there's unsupervised classification, which allows the system itself to figure out which documents are similar to each other.

    It does that by looking at features within the documents. Each of those features is represented as a dimension in a massively high-dimensional feature space, and documents are clustered together according to their nearness in that feature space.
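
    A toy version of "nearness in a feature space" can be written in plain Python. In this sketch each distinct word is one dimension, nearness is measured with cosine similarity, and documents are grouped with a simple greedy pass. This is an assumed stand-in for illustration only and not the algorithm Oracle uses. The threshold value and all names are made up for the example.

```python
import math
from collections import Counter


def features(text):
    """Bag-of-words features: each distinct word is one dimension."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_similar(docs, threshold=0.5):
    """Greedy grouping: a document joins the first cluster whose seed is near enough."""
    clusters = []  # list of (seed_vector, [document names])
    for name, text in docs.items():
        vector = features(text)
        for seed, members in clusters:
            if cosine(seed, vector) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((vector, [name]))
    return [members for _, members in clusters]


# Hypothetical sample documents.
SAMPLE = {
    "a": "database backup and database restore",
    "b": "backup and restore the database",
    "c": "maps and spatial analysis of maps",
}
GROUPS = group_similar(SAMPLE)
```

    With these samples, the two backup-related documents land in one group and the spatial document in another, without any training set being supplied.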

    Then there's named entity recognition. We've already talked about that a little bit. And then finally, there is sentiment analysis, the ability to identify whether a document is positive or negative with respect to a given aspect.

    23:56

    Nikita: Now, for those who are already Oracle database users, how easy is it to enable text searching within applications using Oracle Text?

    Sangeetha: If you're already an Oracle database user, enabling text searching within your applications is quite straightforward. Oracle Text uses the same SQL language as the database. And it integrates seamlessly with your existing SQL. Oracle Text can be used from any programming language that has a SQL interface, meaning just about all of them. 

    24:32

    Lois: OK from Oracle Text, I’d like to move on to Oracle Spatial Studio. Can you tell us more about this tool?

    Sangeetha: Spatial Studio is a no-code, self-service application that makes it easy to access the sorts of spatial features that we've been looking at, in particular, in order to get that data prepared to use with spatial, visualizing results in maps and tables, and also doing the analysis and sharing results. Spatial Studio is included at no extra cost with Autonomous Database. The Studio web application itself has no additional cost, and it runs on the server.

    25:13

    Nikita: Let’s talk a little more about the cost. How does the deployment of Spatial Studio work, in terms of the server it runs on? 

    Sangeetha: So, the server that it runs on, if it's running in the Cloud, that computing node, it would have some cost associated with it. It can also run on a free tier with a very small shape, just for evaluation and testing. 

    Spatial Studio is also available on the Oracle Cloud Marketplace. And there are a couple of self-paced workshops that you can access for installing and using Spatial Studio.

    25:47

    Lois: And how do developers access and work with Oracle Autonomous Database using Spatial Studio?

    Sangeetha: Oracle Spatial Studio allows you to access data in Oracle Database, including Oracle Autonomous Database. You can create connections to Oracle Autonomous Databases, and then you work with the data that's in the database. You can also use Spatial Studio to load data to Oracle Database, including Oracle Autonomous Database.

    So, you can load spreadsheets and common spatial formats. And once you've loaded your data or accessed data that already exists in your Autonomous Database, if that data does not already include native geometries, the Oracle native geometry type, then you can prepare the data if it has addresses or if it has latitude and longitude coordinates as a part of the data.

    26:43

    Nikita: What about visualizing and analyzing spatial data using Spatial Studio?

    Sangeetha: Once you have the data prepared, you can easily drag and drop and start to visualize your data, style it, and look at it in different ways. And then, most importantly, you can start to ask spatial questions, do all kinds of spatial analysis, like we've talked about earlier.

    Spatial Studio provides a GUI that allows you to perform those same kinds of spatial analysis. And then the results can be dropped on the map and visualized so that you can actually see the results of the spatial questions that you're asking. When you've done some work, you can save your work in a project that you can return to later, and you can also publish and share the work you've done.

    27:34

    Lois: Thank you, Sangeetha. For the final part of our conversation today, we’ll talk with Thea. Thea, thanks so much for joining us. Let's get the basics out of the way. How can data be loaded directly into Autonomous Database?
    Thea: Data can be loaded directly to ADB through applications such as SQL Developer, which can read data files, such as txt and xls, and load directly into tables in ADB.

    27:59

    Nikita: I see. And is there a better method to load data into ADB?
    Thea: A more efficient and preferred method for loading data into ADB is to stage the data in a cloud object store, preferably Oracle's, but Amazon S3 and Azure Blob Storage are also supported. Any file type can be staged in object store. Once the data is in object store, Autonomous Database can access it directly. Tools can be used to facilitate the data movement between object store and the database.

    28:27

    Lois: Are there specific steps or considerations when migrating a physical database to Autonomous?

    Thea: A physical database can't simply be migrated to Autonomous because the database must be converted to a pluggable database, upgraded to 19c, and encrypted. Additionally, any changes to Oracle-shipped stored procedures or views must be found and reverted. All uses of container database admin privileges must be removed. And all legacy features that are not supported must be removed, such as legacy LOBs.
    Data Pump (expdp/impdp) must be used for migrating databases of versions 10.1 and above to Autonomous Database, as it addresses the issues just mentioned. For online migrations, GoldenGate must be used to keep the old and new databases in sync.

    29:15

    Nikita: When you’re choosing the method for migration and loading, what are the factors to keep in mind?

    Thea: It's important to segregate the methods by functionality and limitations of use against Autonomous Database. The considerations are as follows. Number one, how large is the database to be imported? Number two, what is the input file format? Number three, does the method support non-Oracle database sources? And number four, does the method support using Oracle and/or third-party object store?

    29:45

    Lois: Now, let’s move on to the tools that are available. What does the DBMS_CLOUD functionality do?

    Thea: The Oracle Autonomous Database has built-in functionality called DBMS_CLOUD, specifically designed so the database can move data back and forth with external sources through a secure and transparent process. DBMS_CLOUD allows data movement from the Oracle object store. That includes data from any application or data source exported to text (.csv or JSON) and output from third-party data integration tools.

    DBMS_CLOUD can also access data stored on Object Storage from the other clouds, AWS S3 and Azure Blob Storage. DBMS_CLOUD does not impose any volume limit, so it's the preferred method to use. SQL*Loader can be used for loading data located on the local client file systems into Autonomous Database. There are limits around OS and client machines when using SQL*Loader.
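
    A load through DBMS_CLOUD typically involves two calls: one to store a credential and one to copy a staged file into a table. The Python sketch below only assembles the PL/SQL text for DBMS_CLOUD.CREATE_CREDENTIAL and DBMS_CLOUD.COPY_DATA. It does not connect to a database. The helper names, the credential and table names, and the object store URL are placeholders assumed for this example, and the format options shown (a CSV file with one header row) are just one possible choice.

```python
def create_credential_block(credential_name):
    """PL/SQL block for DBMS_CLOUD.CREATE_CREDENTIAL; secrets are passed as binds."""
    return (
        "BEGIN\n"
        "  DBMS_CLOUD.CREATE_CREDENTIAL(\n"
        f"    credential_name => '{credential_name}',\n"
        "    username        => :username,\n"
        "    password        => :password);\n"
        "END;"
    )


def copy_data_block(table_name, credential_name, file_uri):
    """PL/SQL block for DBMS_CLOUD.COPY_DATA loading one CSV file with a header row."""
    return (
        "BEGIN\n"
        "  DBMS_CLOUD.COPY_DATA(\n"
        f"    table_name      => '{table_name}',\n"
        f"    credential_name => '{credential_name}',\n"
        f"    file_uri_list   => '{file_uri}',\n"
        "    format          => json_object('type' value 'csv', "
        "'skipheaders' value '1'));\n"
        "END;"
    )


# Placeholder names and URL, assumed purely for illustration.
CREDENTIAL_SQL = create_credential_block("OBJ_STORE_CRED")
COPY_SQL = copy_data_block(
    "SALES", "OBJ_STORE_CRED", "https://objectstorage.example/bucket/sales.csv"
)
```

    Keeping the user name and password as bind variables rather than literals avoids embedding secrets in the generated text. The target table is assumed to exist already.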

    30:49

    Nikita: So then, when should I use Data Pump and SQL Developer for migration?

    Thea: Data Pump is the best way to migrate a full or partial database into ADB, including databases from previous versions. Because Data Pump will perform the upgrade as part of the export/import process, this is the simplest way to get to ADB from any existing Oracle Database implementation. SQL Developer provides a GUI front end for using Data Pump that can automate the whole export and import process from an existing database to ADB.

    SQL Developer also includes an import wizard that can be used to import data from several file types into ADB. A very common use of this wizard is for importing Excel files into ADW. Once a credential is created, it can be used to access a file as an external table or to ingest data from the file into a database table. DBMS_CLOUD makes it much easier to use external tables, and the ORGANIZATION EXTERNAL clause needed in other versions of Oracle Database is not needed.

    31:54

    Lois: Thea, what about Oracle Object Store? How does it integrate with Autonomous Database, and what advantages does it offer for staging data?

    Thea: Oracle Object Store is directly integrated into Autonomous Database and is the best option for staging data that will be consumed by ADB. Any file type can be stored in object store, including SQL*Loader files, Excel, JSON, Parquet, and, of course, Data Pump DMP files. Flat files stored on object store can also be used as Oracle Database external tables, so they can be queried directly from the database as part of a normal DML operation.

    Object store is separate from the storage allocated to the Autonomous Database for database objects, such as tables and indexes. That storage is part of the Exadata system Autonomous Database runs on, and it is automatically allocated and managed. Users do not have direct access to that storage.

    32:50

    Nikita: I know that one of the main considerations when loading and updating ADB is the network latency between the data source and the ADB. Can you tell us more about this?

    Thea: Many ways to measure this latency exist. One is the website cloudharmony.com, which provides many real-time metrics for connectivity between the client and Oracle Cloud Services. It's important to run these tests when determining which Oracle Cloud service location will provide the best connectivity.

    The Oracle Cloud Dashboard has an integrated tool that will provide real time and historic latency information between your existing location and any specified Oracle Data Center. When migrating data to Autonomous Database, table statistics are gathered automatically during direct-path load operations. If direct-path load operations are not used, such as with SQL Developer loads, the user can gather statistics manually as needed.

    33:44

    Lois: And finally, what can you tell us about the Database Migration Service?

    Thea: Database Migration Service is a fully managed service for migrating databases to ADB. It provides logical online and offline migration with minimal downtime and validates the environment before migration. We have a requirement that the source database is on Linux. And it would be interesting to see if we are going to have other use cases where we need other, non-Linux operating systems.

    This requirement is because we are using SSH to directly execute commands on the source database. For this, we are certified on Linux only. Targets in the first release are Autonomous Databases, ATP or ADW, both serverless and dedicated. For the agent environment, we require a Linux operating system, and this is Linux-safe. In general, we're targeting a number of different use cases: migrating from on-premises, third-party clouds, Oracle legacy clouds, such as Oracle Classic, or even migrating within OCI Cloud, and doing that with or without a direct connection.

    If you don't have a direct connection, for example because the source is behind a firewall, we support offline migration. If you have a direct connection, we support both offline and online migration. For more information on all the migration approaches available for your particular situation, check out the Oracle Cloud Migration Advisor.

    35:06

    Nikita: I think we can wind up our episode with that. Thanks to all our experts for giving us their insights. 

    Lois: To learn more about the topics we’ve discussed today, visit mylearn.oracle.com and search for the Oracle Autonomous Database Administration Workshop. Remember, all of the training is free, so dive right in! Join us next week for another episode of the Oracle University Podcast. Until then, Lois Houston…

    Nikita: And Nikita Abraham, signing off!

    35:35

    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

    Autonomous Database on Dedicated Infrastructure

    The Oracle Autonomous Database Dedicated deployment is a good choice for customers who want to implement a private database cloud in their own dedicated Exadata infrastructure. That dedicated infrastructure can either be in the Oracle Public Cloud or in the customer's own data center via Oracle Exadata Cloud@Customer.
     
    In a dedicated environment, the Exadata infrastructure is entirely dedicated to the subscribing customer, isolated from other cloud tenants, with no shared processor, storage, and memory resource.
     
    In this episode, hosts Lois Houston and Nikita Abraham speak with Oracle Database experts about how Autonomous Database Dedicated offers greater control of the software and infrastructure life cycle, customizable policies for separation of database workload, software update schedules and versioning, workload consolidation, availability policies, and much more.
     
    Oracle University Learning Community: https://education.oracle.com/ou-community
    X (formerly Twitter): https://twitter.com/Oracle_Edu
     
    Special thanks to Arijit Ghosh, David Wright, Tamal Chatterjee, and the OU Studio Team for helping us create this episode.
     
    -------------------------------------------------------
     
    Episode Transcript:
     

    00:00
    Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started.
    00:26
    Nikita: Hello and welcome to the Oracle University Podcast. I’m Nikita Abraham, Principal Technical Editor with Oracle University, and I’m joined by Lois Houston, Director of Innovation Programs.
    Lois: Hi there! This is our second episode on Oracle’s Autonomous Database, and today we’re going to spend time discussing Autonomous Database on Dedicated Infrastructure. We’ll be talking with three of our colleagues: Maria Colgan, Kamryn Vinson, and Kay Malcolm.
    00:53
    Nikita: Maria is a Distinguished Product Manager for Oracle Database, Kamryn is a Database Product Manager, and Kay is a Senior Director of Database Product Management. 
    Lois: Hi Maria! Thanks for joining us today. We know that Oracle Autonomous Database offers two deployment choices: serverless and dedicated Exadata infrastructure. We spoke about serverless infrastructure last week but for anyone who missed that episode, can you give us a quick recap of what it is?
    01:22
    Maria: With Autonomous Database Serverless, Oracle automates all aspects of the infrastructure and database management for you. That includes provisioning, configuring, monitoring, backing up, and tuning. You simply select what type of database you want, maybe a data warehouse, transaction processing, or a JSON document store, which region in the Oracle Public Cloud you want that database deployed, and the base compute and storage resources necessary. Oracle automatically takes care of everything else. Once provisioned, the database can be instantly scaled through our UI, our APIs, or automatically based on your workload needs. All scaling activities happen completely online while the database remains open for business.
    02:11
    Nikita: Ok, so now that we know what serverless is, let’s move on to dedicated infrastructure. What can you tell us about it?
    Maria: Autonomous Database Dedicated allows customers to implement a private database cloud running on their own dedicated Exadata infrastructure. That dedicated infrastructure can be in Oracle’s Public Cloud or in the customer's own data center via Oracle Exadata Cloud@Customer. It makes an ideal platform to consolidate multiple databases regardless of their workload type or their size. And it also allows you to offer database as a service within your enterprise.
    02:50
    Lois: What are the primary benefits of Autonomous Database Dedicated infrastructure?
    Maria: With the dedicated deployment option, you must first subscribe to Dedicated Exadata Cloud Infrastructure that is isolated from other tenants with no shared processors, memory, network, or storage resources.
    This infrastructure choice offers greater control of both the software and the infrastructure life cycle. Customers can specify their own policies for workload separation, software update schedules, and availability. One of the key benefits of an autonomous database is a lower total cost of ownership through more automation and operational delegation to Oracle. Remember it’s a fully managed service. All database operations, such as backup, software updates, upgrades, OS maintenance, incident management, and health monitoring, will be automatically done for you by Oracle. Its maximum availability architecture protects you from any hardware failures and in the event of a full outage, the service will be automatically failed over to your standby site. Built-in application continuity ensures zero downtime during the standard software update or in the event of a failover. 
    04:09
    Nikita: And how is this billed? 
    Maria: Autonomous Database also has true pay-per-use billing so even when autoscale is enabled, you’ll only pay for those additional resources when you use them. And we make it incredibly simple to develop on this environment with managed developer add-ons like our low code development environment, APEX, and our REST data services. This means you don’t need any additional development environments in order to get started with a new application.
    04:40
    Lois: Ok. So, it looks like the dedicated option offers more control and customization. Maria, how do we access a dedicated database over a network?
    Maria: The network path is through a VCN, or Virtual Cloud Network, and the subnet that's defined by the Exadata infrastructure hosting the database. By default, this subnet is defined as private, meaning, there's no public internet access to those databases. This ensures only your company can access your Exadata infrastructure and your databases.
    Autonomous Database Dedicated can also take advantage of network services provided by OCI, including subnets or VCN peering, as well as connections to on-prem databases through IPSec VPN and FastConnect dedicated corporate network connections.
    05:33
    Maria: You can also take advantage of the Oracle Microsoft partnership that enables customers to connect their Oracle Cloud Infrastructure resources and Microsoft Azure resources through a dedicated private connection. However, for some customers, a move to the public cloud is just not possible. Perhaps it's due to industry regulations, performance concerns, or integration with legacy on-prem applications. For these types of customers, Exadata Cloud@Customer should meet their requirements for strict data sovereignty and security by delivering high-performance Exadata Cloud Services capabilities in their data center behind their own firewall.
    06:16
    Nikita: What are the benefits of Autonomous Database on Exadata Cloud@Customer? How’s it different?
    Maria: Autonomous Database on Exadata Cloud@Customer provides the same service as Autonomous Database Dedicated in the public cloud.
    So you get the same simplicity, agility, performance, and elasticity that you get in the cloud. But it also provides a very fast and simple transition to an autonomous cloud because you can easily migrate on-prem databases to Exadata Cloud@Customer. Once the database is migrated, any existing applications can simply reconnect to that new database and run without any application changes being needed. And the data will never leave your data center, making it a very safe way to adopt a cloud model.
    07:04
    Lois: So, how do we manage communication to and from the public cloud?
    Maria: Each Cloud@Customer rack includes two local control plane servers to manage the communication to and from the public cloud. The local control plane acts on behalf of requests from the public cloud, keeping communications consolidated and secure. Platform control plane commands are sent to the Exadata Cloud@Customer system through a dedicated WebSocket secure tunnel. 
    Oracle Cloud operations staff use that same tunnel to monitor the autonomous database on Exadata Cloud@Customer both for maintenance and for troubleshooting. The two local control plane servers installed in the Exadata Cloud@Customer rack host that secure tunnel endpoint and act as a gateway for access to the infrastructure. They also host components that orchestrate the cloud automation and that aggregate and route telemetry messages from the Exadata Cloud@Customer platform to the Oracle Support Service infrastructure. And they also host images for server patching.
    08:13
    Maria: The Exadata Database Server is connected to the customer-managed switches via either 10 gigabit or 25 gigabit Ethernet. Customers have access to the customer Virtual Machine, or VM, via a pair of layer 2 network connections that are implemented as Virtual Network Interface Cards, or vNICs. They're also VLAN tagged. The physical network connections are implemented for high availability in an active-standby configuration.
    Autonomous Database on Exadata Cloud@Customer provides the best of both worlds-- all of the automation including patching, backing up, scaling, and management of a database that you get with a cloud service, but without the data ever leaving the customer's data center.
    09:01
    Nikita: That's interesting. And, what happens if a dedicated database loses network connectivity to the OCI control plane?
    Maria: In the event an autonomous database on Exadata Cloud@Customer loses network connectivity to the OCI control plane, the Autonomous Database will actually continue to be available for your applications. And operations such as backups and autoscaling will not be impacted by that loss of network connectivity.
    However, the management and monitoring of the Autonomous Database via the OCI console and APIs as well as access by the Oracle Cloud operations team will not be available until that network is reconnected.
    09:43
    Maria: The capabilities suspended in the case of a lost network connection include, as I said, infrastructure management-- so that's the manual scaling of an Autonomous Database via the UI, our OCI CLI, or REST APIs, as well as Terraform scripts. They won't be available. Neither will the ability for Oracle Cloud ops to access and perform maintenance activities, such as patching. Nor will we be able to monitor the Oracle infrastructure during the time when the system is not connected.
    10:20
    Lois: That’s good to know, Maria. What about data encryption and backup options?
    Maria: All Oracle Autonomous Databases encrypt data at rest. Data is automatically encrypted as it's written to the storage. But this encryption is transparent to authorized users and applications because the database automatically decrypts the data when it's being read from the storage. There are several options for backing up Autonomous Database on Exadata Cloud@Customer, including using a Zero Data Loss Recovery Appliance, or ZDLRA. You can back it up to locally mounted NFS storage or back it up to the Oracle Public Cloud.
    10:57
    Nikita: I want to ask you about the typical workflow for Autonomous Database Dedicated infrastructure. What are the main steps here?
    Maria: In the typical workflow, the fleet administrator role performs the following steps. They provision the Exadata infrastructure by specifying its size, availability domain, and region within the Oracle Cloud. Once the hardware has been provisioned, the fleet administrator partitions the system by provisioning clusters and container databases. Then the developers, DBAs, or anyone who needs a database can provision databases within those container databases.
    Billing is based on the size of the Exadata infrastructure that's provisioned. So whether that's a quarter rack, half rack, or full rack. It also depends on the number of CPUs that are being consumed. Remember, it's also possible for customers to use their existing Oracle database licenses with this service to reduce the cost.
    11:53
    Lois: And what Exadata infrastructure models and shapes does Autonomous Database Dedicated support?
    Maria: That's the X7, X8, and X8M and you can get all of those in either a quarter, half, or full Exadata rack. Currently, you can create a maximum of 12 VM clusters on an Autonomous Database Dedicated infrastructure.
    We also advise that you limit the number of databases you provision to meet your preferred SLA. To meet the high availability SLA, we recommend a maximum of 100 databases. To meet the extreme availability SLA, we recommend a maximum of 25 databases.
    12:35
    Nikita: Ok, so now that I know all this, how do I actually get started with Autonomous Database on dedicated infrastructure?
    Maria: You need to increase your service limit to include that Exadata infrastructure and then you need to create the fleet and DBA service roles. You also need to create the necessary network model, VM clusters, and container databases for your organization.
    Finally, you need to provide access to the end users who want to create and use those Autonomous Databases. Autonomous Database requires a subscription to that Exadata infrastructure for a minimum of 48 hours. But once subscribed, you can test out ideas and then terminate the subscription with no ongoing costs. While subscribed, you can control where you place the resources, for example, to manage latency-sensitive applications.
    13:29
    Maria: You can also have control over patching schedules and software versions, so you can be sure that you're testing exactly what you need to. You can also migrate databases to the Autonomous Database using our export and import capabilities via the object store, or through Data Pump or GoldenGate. As with any Autonomous Database, once it's provisioned, you've got full access to both autoscaling and all our cloning capabilities.
    13:57
    Lois: Maria, I've heard you talk about the importance of clean role separation in managing a private cloud. Can you elaborate on that, please?
    Maria: A successful private cloud is set up and managed using clean role separation between the fleet administration group and the developer or DBA groups. The fleet administration group establishes the governance constraints, including things like budgeting, capacity, compliance, and SLAs, according to the business structure. The physical resources are also logically grouped to align with this business structure, and then groups of users are given self-service access to the resources within these groups. So a good example of this would be that the developer and DBA groups use self-service database resources within these constraints.
    14:46
    Nikita: I see. So, what exactly does a fleet administrator do?
    Maria: Fleet administrators allocate budget by department and are responsible for the creation, monitoring, and management of the Autonomous Exadata Infrastructure, the Autonomous Exadata VM clusters, and the Autonomous Container Databases. To perform these duties, a fleet administrator must have an Oracle Cloud account or user, and that user must have permissions to manage these resources and to use the network resources that need to be specified when you create them.
    15:24
    Nikita: And what about database administrators?

    Maria: Database administrators create, monitor, and manage autonomous databases. They, too, need to have an Oracle Cloud account or be an Oracle Cloud user. Now, those accounts need to have the necessary permissions in order to create and access databases. They also need to be able to access autonomous backups and have permission to access the autonomous container databases, inside which these autonomous databases will be created, and have all of the necessary permissions to be able to create those databases, as I said.
    While creating autonomous databases, the database administrators will define and gain access to an admin user account inside the database. It's through this account that they will actually get the necessary permissions to be able to create and control database users. 
    16:24
    Lois: How do developers fit into the picture?
    Maria: Database users and developers who write applications that will use or access an autonomous database don't actually need Oracle Cloud accounts. Instead, the database administrators give them the network connectivity and authorization information they need to access those databases.
    16:45
    Lois: Maria, you mentioned the various ways to manage the lifecycle of an autonomous dedicated service. Can you tell us more about that?
    Maria: You can manage the lifecycle of an autonomous dedicated service through the Cloud UI, the Command Line Interface, our REST APIs, or one of the several language SDKs. The lifecycle operations that you can manage include capacity planning and setup, the provisioning and partitioning of Exadata infrastructure, the provisioning and management of databases, the scaling of CPU, storage, and other resources, the scheduling of updates for the infrastructure, the VMs, and the database, as well as monitoring through event notifications.
    17:30
    Lois: And how do policies come into play?
    Maria: OCI allows fine-grained control over resources through the application of policies to groups. These policies are applicable to any member of the group.
    For Oracle Autonomous Database on dedicated infrastructure, the resources in question are Autonomous Exadata Infrastructure, Autonomous Container Databases, Autonomous Databases, and autonomous backups.
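    Editor's sketch: the role separation Maria describes can be pictured as two sets of policy statements, one per group. The Python below only assembles the statement text; it does not call OCI. The group names (FleetAdmins, DBAs), the compartment names, the choice of verbs per role, and the resource-type identifiers are illustrative assumptions, not taken from the episode, so check them against the OCI policy reference before use.

```python
# Sketch only: builds OCI-style policy statement strings for the two roles.
# Group names, compartment names, and the grants per role are hypothetical.
# The resource-type identifiers are assumed spellings of the four resources
# named in the episode; verify them in the OCI policy reference.

VERBS = ("inspect", "read", "use", "manage")  # OCI policy verbs, least to most access

# Fleet administrators own the infrastructure and container databases.
FLEET_ADMIN_GRANTS = [
    ("manage", "autonomous-exadata-infrastructures"),
    ("manage", "autonomous-container-databases"),
]

# DBAs create databases inside existing container databases and handle backups.
DBA_GRANTS = [
    ("use", "autonomous-container-databases"),
    ("manage", "autonomous-databases"),
    ("manage", "autonomous-backups"),
]


def statement(group: str, verb: str, resource_type: str, compartment: str) -> str:
    """Return one policy statement granting `verb` on `resource_type` to `group`."""
    if verb not in VERBS:
        raise ValueError(f"unknown policy verb: {verb!r}")
    return f"Allow group {group} to {verb} {resource_type} in compartment {compartment}"


def build_policy(group: str, grants, compartment: str) -> list[str]:
    """Expand a list of (verb, resource_type) grants into policy statements."""
    return [statement(group, verb, rtype, compartment) for verb, rtype in grants]


if __name__ == "__main__":
    lines = build_policy("FleetAdmins", FLEET_ADMIN_GRANTS, "FleetCompartment")
    lines += build_policy("DBAs", DBA_GRANTS, "DevCompartment")
    for line in lines:
        print(line)
```

    Because policies apply to every member of a group, keeping the grants in two separate lists mirrors the separation of duties: adding a user to the DBA group never hands them infrastructure-level permissions.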
    Lois: Thanks so much, Maria. That was great information.
    18:05
    The Oracle University Learning Community is a great place for you to collaborate and learn with experts, peers, and practitioners. Grow your skills, inspire innovation, and celebrate your successes. The more you participate, the more recognition you can earn. All of your activities, from liking a post to answering questions and sharing with others, will help you earn badges and ranks, and be recognized within the community.

    If you are already an Oracle MyLearn user, go to MyLearn to join the community. You will need to log in first. If you have not yet accessed Oracle MyLearn, visit mylearn.oracle.com and create an account to get started.
    18:44
    Nikita: Welcome back! Hi Kamryn, thanks for joining us on the podcast. So, in an Autonomous Database environment where most DBA tasks are automated, what exactly does an application DBA do?
    Kamryn: While Autonomous Database automates most of the repetitive tasks that DBAs perform, the application DBA will still want to monitor and diagnose databases for applications to maintain the highest performance and the greatest security possible.
    Tasks the application DBA performs include operations on databases, cloning, movement, monitoring, and creating alerts. When required, the application DBA performs low-level diagnostics for application performance and looks for insights on performance and capacity trends.
    19:36
    Nikita: I see. And which tools do they use for these tasks?
    Kamryn: There are several tools at the application DBA's disposal, including Enterprise Manager, Performance Hub, and the OCI Console.
    For Autonomous Dedicated, all the database operations are exposed through the console UI and available through REST API calls, including provisioning, stop/start, lifecycle operations for dedicated database types, unscheduled on-demand backups and restores, CPU scaling and storage management, providing connectivity information (including wallets), and scheduling updates.
    20:17
    Lois: So, Kamryn, what tools can DBAs use for deeper exploration?
    Kamryn: For deeper exploration of the databases themselves, Autonomous Database DBAs can use SQL Developer Web, Performance Hub, and Enterprise Manager.
    20:31
    Nikita: Let’s bring Kay into the conversation. Hi Kay! With Autonomous Database Dedicated, I’ve heard that customers have more control over patching. Can you tell us a little more about that?
    Kay: With Autonomous Database Dedicated, customers get to determine the update or patching schedule if they wish. Oracle automatically manages all patching activity, but with the ADB-Dedicated service, customers have the option of customizing the patching schedule. You can specify which month in every quarter you want, which week in that month, which day in that week, and which patching window within that day. You can also dynamically change the scheduled patching date and time for a specific database if the originally scheduled time becomes inconvenient.

    21:22
    Lois: That's great! So, how often are updates published, and what options do customers have when it comes to applying these updates?
    Kay: Every quarter, updates are published to the console, and OCI notifications are sent out. ADB-Dedicated allows for greater control over updates by allowing you to choose to apply the current update or stay with the previous version and skip to the next release. And the latest update can be applied immediately. This provides fleet administrators with the option to maintain test and production systems at different patch levels.
    A fleet administrator or a database admin sets up the software version policy at the Autonomous Container Database level during provisioning, although the defaults can be modified at any time for an existing Autonomous Container Database. At the bottom of the Autonomous Exadata Infrastructure provisioning screen, you will see a Configure the Automatic Maintenance section, where you should click Modify Schedule.
    22:34
    Nikita: What happens if a customer doesn't customize their patching schedule?
    Kay: If you do not customize a schedule, it behaves like Autonomous Serverless, and Oracle will set a schedule for you. ADB-Dedicated customers get to choose the patching schedule that fits their business. 
    22:52
    Lois: Back to you, Kamryn, I know a bit about Transparent Data Encryption, but I'm curious to learn more. Can you tell me what it does and how it helps protect data?
    Kamryn: Transparent Data Encryption, TDE, enables you to encrypt sensitive data that you store in tables and tablespaces. After the data is encrypted, this data is transparently decrypted for authorized users or applications when they access this data. TDE helps protect data stored on media, also called data at rest, if the storage media or data file is stolen. Oracle Database uses authentication, authorization, and auditing mechanisms to secure data in the database, but not in the operating system data files where data is stored. To protect these data files, Oracle Database provides TDE.
    23:45
    Nikita: That sounds important for data security. So, how does TDE protect data files?
    Kamryn: TDE encrypts sensitive data stored in data files. To prevent unauthorized decryption, TDE stores the encryption keys in a security module external to the database called a keystore.
    You can configure Oracle Key Vault as part of the TDE implementation. This enables you to centrally manage TDE keystores, called TDE wallets in Oracle Key Vault, in your enterprise. For example, you can upload a software keystore to Oracle Key Vault and then make the contents of this keystore available to other TDE-enabled databases.
    24:28
    Lois: What about Oracle Autonomous Database? How does it handle encryption?
    Kamryn: Oracle Autonomous Database uses always-on encryption that protects data at rest and in transit. All data stored in Oracle Cloud and network communication with Oracle Cloud is encrypted by default. Encryption cannot be turned off.
    By default, Oracle Autonomous Database creates and manages all the master encryption keys used to protect your data, storing them in a secure PKCS 12 keystore on the same Exadata systems where the databases reside. If your company's security policies require it, Oracle Autonomous Database can instead use keys you create and manage. In that case, customers control the generation and rotation of the keys.
    25:19
    Kamryn: The Autonomous Databases you create automatically use customer-managed keys because the Autonomous Container Database in which they are created is configured to use customer-managed keys. Thus, those users who create and manage Autonomous Databases do not have to worry about configuring their databases to use customer-managed keys.
    25:41
    Nikita: Thank you so much, Kamryn, Kay, and Maria for taking the time to give us your insights. To learn more about provisioning Autonomous Database Dedicated resources, head over to mylearn.oracle.com and search for the Oracle Autonomous Database Administration Workshop.
    Lois: In our next episode, we will discuss Autonomous Database tools. Until then, this is Lois Houston…
    Nikita: …and Nikita Abraham signing off.
    26:07
    That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.