
Seaborn (Python Library)

https://seaborn.pydata.org

For data scientists using Python, libraries such as NumPy and Pandas are useful for reading and manipulating data on the back end. But how can that data be visualized? One answer is Seaborn, a data visualization library built on top of another library, Matplotlib. Seaborn is well suited to graphing statistical data and works especially well with data processed through Pandas. To read more about Seaborn, visit the Seaborn site at https://seaborn.pydata.org.

Uses: Seaborn is a versatile graphing library. It can graph relationships between variables, compare distributions, and automatically estimate and plot linear regression models for given data. Its high-level interface makes graphing complex data easier. In terms of appearance, Seaborn provides built-in themes and color palettes, along with control over the shape, size, and styling of a plot. For instance, data above a threshold can be displayed in one color while data below it is displayed in another.
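
The uses described above can be sketched roughly as follows. This is a minimal illustration, not a recommended workflow: the data is synthetic, and the column names, the threshold value, and the output file name are all assumptions made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (made up for illustration): y depends roughly linearly on x.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
df = pd.DataFrame({"x": x, "y": 2.0 * x + 1.0 + rng.normal(0, 1.5, size=x.size)})

# Hypothetical threshold: flag each point by whether y exceeds it.
threshold = 10.0
df["above_threshold"] = df["y"] > threshold

sns.set_theme(style="whitegrid")  # apply one of Seaborn's built-in themes

fig, ax = plt.subplots(figsize=(6, 4))
# Color points by the boolean column so each side of the threshold gets its own color.
sns.scatterplot(data=df, x="x", y="y", hue="above_threshold", ax=ax)
# Overlay an estimated linear regression line (scatter=False draws only the fit).
sns.regplot(data=df, x="x", y="y", scatter=False, color="gray", ax=ax)

fig.savefig("seaborn_example.png")  # hypothetical output file name
```

In an interactive session, the backend line can be dropped and plt.show() used instead of saving to a file.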

Documentation: Click Here 


Pandas (Python Library)

https://pandas.pydata.org

Pandas is a Python library that provides data structures and operations for manipulating data sets. One of its most practical uses is importing data from external sources, such as CSV, JSON and Excel files and SQL databases. After importing, Pandas converts the raw data into a usable data frame. Pandas is a very useful tool for programmers doing data analytics in Python: it makes importing, manipulating, merging, cleaning and re-exporting data easy to do in a Python environment. To learn more, visit https://pandas.pydata.org.

Uses: Pandas is an essential tool for data analysis in Python. Because much data is distributed in CSV and Excel formats, these files must be loaded into Python objects before they can be analyzed. Pandas performs this task with ease, creating data frames that organize data into labeled rows and columns. Pandas also supports cleaning data, such as removing unwanted columns or rows, and merging data, such as combining data from two different files.
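
As a rough sketch of the import, clean, merge and re-export steps described above: the two small CSV tables below are made up for illustration and held in memory with io.StringIO, standing in for files on disk. The column names are assumptions, not part of any real data set.

```python
import io

import pandas as pd

# Two small in-memory CSV "files" (made-up data) standing in for files on disk.
students_csv = io.StringIO("id,name,temp\n1,Ana,x\n2,Ben,y\n3,Cy,z\n")
scores_csv = io.StringIO("id,score\n1,88\n2,92\n4,75\n")

# Importing: read each CSV into a data frame.
students = pd.read_csv(students_csv)
scores = pd.read_csv(scores_csv)

# Cleaning: drop a column we do not need.
students = students.drop(columns=["temp"])

# Merging: combine the two tables on their shared "id" column.
# An inner join keeps only ids present in both tables (1 and 2 here).
merged = students.merge(scores, on="id", how="inner")

# Re-exporting: with no path argument, to_csv returns the CSV text as a string.
csv_text = merged.to_csv(index=False)
print(merged)
```

With real files, pd.read_csv and to_csv would take file paths in place of the in-memory buffers.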

Documentation: Click Here.


NumPy

https://numpy.org

NumPy is one of the most useful and popular Python toolkits available for computer scientists, programmers, and data analysts. It is an open-source library that enables numerical computing in Python, including arithmetic on arrays and matrices, statistics functions such as means and medians, and linear algebra functions such as matrix determinants and dot and inner products. It is developed on GitHub and overseen by its “Steering Council.” To learn more, visit https://numpy.org.

Uses: NumPy is widely used by data scientists. The library provides arrays, vectors and matrices, along with the functions and attributes listed above. This lets data be stored and manipulated in multidimensional arrays. Conceptually, NumPy bridges Python and linear algebra, so formulas and theorems can be applied directly in Python code. These capabilities let data scientists store and manipulate data and build predictive models in Python, and they underpin machine learning, artificial intelligence and many other computer science fields.
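
The operations mentioned above (means, medians, determinants, dot and inner products) might look like this in practice. The arrays are arbitrary values chosen for illustration.

```python
import numpy as np

# Statistics on a one-dimensional array.
data = np.array([2.0, 4.0, 4.0, 5.0, 10.0])
mean = np.mean(data)      # 5.0
median = np.median(data)  # 4.0

# Matrix arithmetic and linear algebra on two-dimensional arrays.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])
product = a @ b           # matrix product: [[2, 1], [4, 3]]
det = np.linalg.det(a)    # approximately -2.0 (floating-point result)

# Dot and inner products of two vectors (identical for 1-D arrays).
v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])
dot = np.dot(v, w)        # 32.0
inner = np.inner(v, w)    # 32.0

print(mean, median, det, dot, inner)
```

Because np.linalg.det works in floating point, the determinant is best compared with a tolerance (for example with np.isclose) rather than tested for exact equality.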

Documentation: Click Here.